Abstract



Recent developments in theoretical linguistics have led to a widespread
acceptance of constraint-based analyses of prosodic morphology phenomena
such as truncation, infixation, floating morphemes and reduplication.
Of these, reduplication is particularly challenging for state-of-the-art
computational morphology, since it involves copying of some part of a
phonological string. In this paper I argue for certain extensions to the one-level
model of phonology and morphology (Bird & Ellison 1994) to cover the
computational aspects of prosodic morphology using finite-state methods.
In a nutshell, enriched lexical representations provide additional automaton
arcs to repeat or skip sounds and also to allow insertion of additional
material. A kind of resource consciousness is introduced to control this
additional freedom, distinguishing between producer and consumer arcs.
The non-finite-state copying aspect of reduplication is mapped to automata
intersection, itself a non-finite-state operation. Bounded local optimization
prunes certain automaton arcs that fail to contribute to linguistic optimisation
criteria such as leftmostness of an infix within the word. The paper
then presents implemented case studies of Ulwa construct state infixation,
German hypocoristic truncation and Tagalog overapplying reduplication
that illustrate the expressive power of this approach, before its merits and
limitations are discussed and possible extensions are sketched. I conclude
that the one-level approach to prosodic morphology presents an attractive
way of extending finite-state techniques to difficult phenomena that hitherto
resisted elegant computational analyses.
1 Introduction



Prosodic morphology is the study of natural language phenomena in which the shape of words is to a major
extent determined by phonological factors such as obedience to wellformed syllable or foot structure,
adjacency to stress peaks or sonority extrema etc.¹ Introductory examples from truncation, infixation,
root-and-pattern morphology and reduplication are given in (1).

(1) a. German: hypocoristic i-truncation

Petra > Pet-i, Andrea > And-i, Gorbatschow > Gorb-i 

b. Ulwa: construct state suffixation/infixation 

bas-ka 'his hair', as-ka-na 'his clothes', karas-ka-mak 'his knee' 

c. Modern Hebrew: vowel/∅ alternation in nonconcatenative verbal morphology

gamar 'finished' (3sg.m), gamr-a (3sg.f), 
yi-gmor 'will finish' (3sg.m), yi-gmer-u (3pl) 

d. Mangarayi: reduplicated plural etc. 

gabuji > gababuji 'old people', jimgan > jimgimgan 'knowledgeable people' , 
muygji > muygjuygji 'having a dog' 

Prosodic conditions govern the cutoff point in German hypocoristic truncation, serve to determine the 
placement of the floating -ka- suffix/infix within Ulwa possessive noun constructions, control the deletion 
of vowels in Modern Hebrew nonconcatenative verb morphology and determine position and length of the 
reduplicant copy in Mangarayi plurals. We will see later what some of these conditions are and how to 
devise computational analyses for such phenomena. 

In his extensive overview of the state of the art in computational morphology, Sproat (1992) provides 
ample indication that there is still work to do with regard to these phenomena. Here is a relevant sample of 
Sproat's comments:

Subtractive morphology - presumably since it is relatively infrequent - has attracted no
attention. [p. 170]

... computational models have been only partly successful at analyzing infixes. [p. 50]

From a computational point of view, one point cannot be overstressed: the copying required
in reduplication places reduplication in a class apart from all other morphology. [p. 60]

... a morphological analyzer needs to use information about prosodic structure. [p. 170]

... there is a tendency in some quarters of the computational morphology world to trivialize
the problem, suggesting that the problems of morphology have essentially been solved simply
because there are now working systems that are capable of doing a great deal of morphological
analysis ... One should not be misled by such claims ... there are still outstanding problems
and areas which have not received much serious attention ... [p. 123]

There is still truth in Sproat's words seven years after they were written. 

The primary goal of this paper is therefore to start answering the challenge posed by Sproat's comments
and show how central linguistic insights into the way prosodic morphology works translate into an
implemented finite-state model. The model is named One-Level Prosodic Morphology (OLPM), because
it was developed as an adaptation of One-Level Phonology (OLP; Bird & Ellison 1994) to prosodic
morphology. Despite some initial attempts, it is probably fair to say that prosodic morphology as a branch of
theoretical linguistics is still rather underformalized. While Bird & Ellison (1994, §5.4) does contain a very

¹ This work has been funded by the German research agency DFG under grant WI 853/4 1. I am particularly indebted to T. Mark
Ellison, who generously shared his ideas with me during a visit to Krakow. Thanks also to Richard Wiese and Steven Bird for helpful
comments and encouragement. All remaining errors and shortcomings are mine.
brief autosegmental analysis of just the Arabic verbal stem kattab, itself a legitimate piece of prosodic
morphology, we will see that the extensions introduced in this paper can be justified on the basis of a
broader view of what prosodic morphology encompasses. The model shares the essential assumption of
monostratality with One-Level Phonology, maintaining the restrictive postulate that there should only be
one level of linguistic description. It thus inherits two key advantages of the former model, namely easy
integrability into monostratal frameworks of grammar such as HPSG (Pollard & Sag 1994) and simplified
machine learning of surface-only generalizations (Ellison 1992). By furthermore retaining the finite-state
methodology of its predecessor it keeps what is computationally attractive about the former model's
properties. Combining these essential traits, it is clear that OLPM - like OLP - will be based on finite-state
automata (FSA) with their characteristic combinatory operation of regular set intersection, also known as
automaton product. Thus, it contrasts with models employing two, three or more levels (Koskenniemi 1983,
Touretzky & Wheeler 1990, Chomsky & Halle 1968), which are usually implemented with finite-state
transducers (FSTs) and characteristically employ a composition operation to combine individual
transducers into a single overall mapping.
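
The role of intersection as the combinatory operation can be made concrete with a minimal Python sketch of the
standard product construction. Everything in it is an illustrative assumption rather than part of OLP or OLPM:
the class DFA, the function intersect, the toy lexical item p(i)c(e)n with two optional vowels, and the toy
constraint that bans adjacent consonants.

```python
class DFA:
    """Deterministic automaton with a partial transition function."""

    def __init__(self, start, finals, delta):
        self.start = start
        self.finals = set(finals)
        self.delta = dict(delta)  # (state, symbol) -> next state

    def accepts(self, string):
        state = self.start
        for symbol in string:
            state = self.delta.get((state, symbol))
            if state is None:  # no arc for this symbol: reject
                return False
        return state in self.finals


def intersect(a, b):
    """Product construction: accepts exactly the strings accepted by both a and b."""
    start = (a.start, b.start)
    delta, finals = {}, set()
    agenda, seen = [start], {start}
    while agenda:
        p, q = agenda.pop()
        if p in a.finals and q in b.finals:
            finals.add((p, q))
        for (source, symbol), p_next in a.delta.items():
            if source != p:
                continue
            q_next = b.delta.get((q, symbol))
            if q_next is None:  # b has no matching arc: the pair is pruned
                continue
            target = (p_next, q_next)
            delta[((p, q), symbol)] = target
            if target not in seen:
                seen.add(target)
                agenda.append(target)
    return DFA(start, finals, delta)


# Hypothetical lexical item p(i)c(e)n: both vowels are optional.
LEXICAL = DFA(0, {5}, {(0, "p"): 1, (1, "i"): 2, (1, "c"): 3, (2, "c"): 3,
                       (3, "e"): 4, (3, "n"): 5, (4, "n"): 5})

# Hypothetical constraint: no two adjacent consonants
# (state "C" means the last symbol read was a consonant).
CONSTRAINT = DFA("V", {"V", "C"}, {})
for consonant in "pcn":
    CONSTRAINT.delta[("V", consonant)] = "C"
for vowel in "ie":
    CONSTRAINT.delta[("V", vowel)] = "V"
    CONSTRAINT.delta[("C", vowel)] = "V"

WORD = intersect(LEXICAL, CONSTRAINT)
```

In this sketch the lexical automaton alone admits four realizations, while the product admits only picen. Both
the lexical item and the generalization are stated over the same single level, and no mapping between levels is
involved.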

The present work differs, however, in not following the specific representational assumptions of nonlinear
autosegmental phonology (Goldsmith 1990) embodied in OLP, using a flat prosodic-segmental
representation instead. This difference is not crucial to the task at hand. It mainly stems from our desire to avoid
an additional layer of complexity that promises little return value for the immediate goal stated above.
Needless to say, however, a representational variant of OLPM using autosegmental diagrams does not
seem unthinkable.

There is another difference. Existing research in nonconcatenative finite-state morphology has primarily
been concerned with templatic aspects of Semitic languages (e.g. Kataja & Koskenniemi 1988 for
Akkadian, Beesley, Buckwalter & Newton 1989, Beesley 1996 and Kiraz 1994, Kiraz 1996 for Arabic
and Syriac). Alternative (feature-)logical treatments share this phenomenological bias (Bird & Blackburn
1991 on Arabic, Klein 1993 on Sierra Miwok, Walther 1997 and Walther 1998 on Tigrinya and Modern
Hebrew). In contrast, our focus is on the difficult rest of prosodic morphology, where infixation,
circumfixation, truncation and in particular reduplication have not received elegant computational analyses so
far.

The paper is organized as follows. Section 2 provides some background to the emergence of
constraint-based models of prosodic morphology, while section 3 lays out the range of data to be accounted for. As the
presentation unfolds, a number of desiderata for formalization and implementation are formulated. The
central proposals of the paper are contained in section 4, where I show which representational
assumptions must be made and which new devices need to be incorporated into a comprehensive one-level
model of prosodic morphology. In section 5 the proposals are evaluated in practice by developing detailed
implementations of Ulwa infixation, German truncation and Tagalog reduplication phenomena. Section 6
concludes with a discussion of these proposals, evaluating their merits both on internal grounds and in
comparison to other works.

2 Background 

Since the beginning of the seventies it has been recognized that rule-based models of prosodic morphology 
lack explanatory adequacy, a fact that has come to be known as the 'rule conspiracies' problem (Kisseberth 
1970). Kisseberth used the vowel/zero alternation patterns from inflected verb forms in Tonkawa to make 
his point (2). Tonkawa is an extinct Coahuiltecan language with CV(C) syllable structure.

(2) Tonkawa verb forms

    'to cut'          'to lick'
    #picn-o?#         #netl-o?#         3sg.obj.stem-3sg.subj.
    #we-pcen-o?#      #we-ntal-o?#      3pl.obj.-stem-3sg.subj.
    #picna-n-o?#      #netle-n-o?#      3sg.obj.stem-prog.
    p(i)c(e)n(a)      n(e)t(a)l(e)      stems
While trying to incorporate more and more affixation patterns, Kisseberth observed that the usual
enlarge-corpus/modify-analysis cycle resulted in increasingly baroque levels of complexity for the vowel
deletion rules.² But more to the point, these rules 'conspired' to maintain a very simple, yet global
invariant: sequences of three consecutive consonants are banned on the surface, symbolically *CCC (the word
boundary # acts as a consonant). The reader may easily verify that no further deletion of vowels is
possible in (2) without violating the invariant. Under the rule-based analysis, however, this global condition
is nowhere expressed directly. According to Kisseberth the failure can be traced back to a defect of the
derivational paradigm itself: by design each rule only sees the input given to it by a prior rule application
in the rule composition cascade, being blind to the ultimate output consequences that surface at the end.
Later developments have decomposed segment-level constraints such as *CCC by referring to the prosodic
concept of the syllable instead: complex syllable onsets and codas are disallowed in Tonkawa core
syllables. At least since Kahn (1976) and Selkirk (1982) this idea of an independent level of syllable structure
superseded the SPE (Chomsky & Halle 1968) conception of purely segmental strings in the generative
literature. While the trend towards representationalism in phonology, marked by the advent of such frameworks
as Autosegmental and Metrical Phonology (see Goldsmith 1990 for an overview), reduced the reliance on
rules and further strengthened the role of prosodic structure in actual analyses, the fundamental defect that
Kisseberth and others had recognized in derivational theories still awaited a principled solution.
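
The *CCC invariant can be checked mechanically against the forms in (2). The following Python sketch is only an
illustration: it assumes a plain ASCII transcription in which ? stands for the glottal stop and # for the word
boundary (both counted as consonants), it removes the morpheme hyphens, and the function names are hypothetical.

```python
VOWELS = set("aeiou")


def violates_ccc(form):
    """True if the form contains three consecutive non-vowel symbols."""
    run = 0
    for segment in form:
        if segment in VOWELS:
            run = 0
        else:
            run += 1  # consonants, the glottal stop ? and the boundary # all count
            if run >= 3:
                return True
    return False


def deletable_vowels(form):
    """Positions of vowels whose deletion would still respect *CCC."""
    return [i for i, segment in enumerate(form)
            if segment in VOWELS and not violates_ccc(form[:i] + form[i + 1:])]


# Surface forms from (2), with hyphens removed.
FORMS = ["#picno?#", "#netlo?#", "#wepceno?#",
         "#wentalo?#", "#picnano?#", "#netleno?#"]

for form in FORMS:
    print(form, violates_ccc(form), deletable_vowels(form))
```

Under these assumptions none of the six forms violates *CCC, and deleting any single further vowel creates a
violation, which is the verification left to the reader above. The check is stated once over surface strings,
which is exactly what the individual deletion rules fail to do.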

That solution emerged at the beginning of the nineties, as constraint-based models of phonology were
proposed to directly capture the 'output orientation' whose absence had plagued their derivational predecessors. Bird
(1990) and Scobbie (1991) were among the first to use monotonic formal description languages to express
surface-true non-violable constraints, defining what has now become known collectively as Declarative
Phonology (DP; Scobbie, Coleman & Bird 1996). In DP, both lexical items and more abstract
generalizations are constraints, with constraint conjunction being the characteristic device for formalizing constraint
interaction. Shortly thereafter Prince & Smolensky (1993) argued for ranked violable constraints instead.
Their proposal was named Optimality Theory (OT) and has since become a much-recognized new paradigm
in theoretical phonology and beyond.

In OT, constraints seek to capture conflicting universal tendencies while strict ranking imposes an
extrinsic ordering relation on the set of constraints, expressing which one takes precedence for purposes of
conflict resolution. According to the OT ideal, languages differ only in how they rank the common pool of
constraints. Strictness of ranking means that, in contrast to arbitrarily weighted grammars, no amount of
positive wellformedness of an input with respect to lower-ranked constraints can compensate for
illformedness due to a higher-ranked constraint. Finally, although constraints may be gradiently violated by the set
of structurally enriched candidates that is generated from the input, only candidates with the minimal
number of violations are designated as grammatical. Note that, because of this powerful mechanism of global
optimization, the OT analyst is free to propose constraints that are never surface-true (an example would
be the excessive-structure-minimizer *STRUC 'Avoid structure', Prince & Smolensky 1993, ch.3, fn.13;
see Walther 1996, 13 for a formalization).
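
The difference between strict ranking and arbitrary weighting can be illustrated with a small Python sketch. It
is not taken from any of the cited formalizations. It assumes that each candidate has already been assigned a
vector of violation counts, listed from the highest-ranked constraint to the lowest, and the candidate names,
counts and weights are invented for the example.

```python
def ot_winners(candidates):
    """Strict ranking: compare violation vectors lexicographically.

    candidates maps a candidate name to a tuple of violation counts,
    ordered from the highest-ranked constraint to the lowest.
    """
    best = min(candidates.values())  # Python compares tuples lexicographically
    return sorted(name for name, violations in candidates.items()
                  if violations == best)


def weighted_winners(candidates, weights):
    """Weighted grammar: minimize the weighted sum of violations."""
    cost = {name: sum(w * v for w, v in zip(weights, violations))
            for name, violations in candidates.items()}
    best = min(cost.values())
    return sorted(name for name, c in cost.items() if c == best)


# Hypothetical candidates: cand_a violates the top constraint once,
# cand_b violates only the lower constraint, but five times.
CANDIDATES = {"cand_a": (1, 0), "cand_b": (0, 5)}

print(ot_winners(CANDIDATES))                # cand_b wins under strict ranking
print(weighted_winners(CANDIDATES, (3, 1)))  # cand_a wins under these weights
```

Under strict ranking the single violation of the higher-ranked constraint is decisive, however many lower-ranked
violations the competitor accumulates. With the assumed weights (3, 1) the outcome is reversed, which is the
compensation effect that strictness rules out.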

While DP paid considerable attention to proper formalization of phonology, the empirical domain of
prosodic morphology has so far received much less attention there than in OT, where the co-appearance of
McCarthy & Prince (1993) with Prince & Smolensky (1993) marked the beginning of a continuous
involvement with the subject. In particular, certain problems in the prosodic morphology of reduplication
motivated an extension to classical OT known as Correspondence Theory (McCarthy & Prince 1995). Here
constraints are allowed to simultaneously refer to both levels of two-level pairings for assessing gradient
wellformedness, in what appears to be rather analogous to two-level morphology (Koskenniemi 1983).
However, the range of correspondence-theoretic mappings - mediated by some abstract indexation scheme
- goes beyond Koskenniemi's original framework in that it includes intra-level instances such as
base-reduplicant correspondence within the same word level in addition to classical cross-level mappings like
so-called input-output (i.e. lexical-surface) correspondence.

Despite this growing body of theoretical work the average level of formalization in concrete OT
analyses has been rather low³, which is a genuine problem in the light of Chomsky (1965, 4)'s definition of



² Witness e.g. the rewrite rule (layout lost in extraction; its recoverable terms are [V, +stem], V + C and [V, +stem]) from Kisseberth (1970).



³ Notable exceptions include Albro (1997) on Turkish vowel harmony, Eisner (1997b) on stress systems, Ellison (1994b) on Arabic
a generative grammar as one that must be "perfectly explicit" and "not rely on the intelligence of the
understanding reader".⁴

Note that this is not due to a lack of proposals for formalizing the abstract paradigm of (classical) OT
itself, where Ellison (1994b), Walther (1996), Eisner (1997a), Frank & Satta (1998), and Karttunen (1998)
have all made contributions within the framework of finite-state systems. Karttunen's work is especially
worth mentioning, because he was able to show that, under regularity assumptions for both the constraints
and the input set, optimal outputs can be computed using only established finite-state operations such
as transducer composition and automaton complement, without any overt optimization component. The
impact of his results is that they place SPE-style rule cascades and OT-style constraint rankings into rather
close proximity. Both represent ways to define finite-state transducers, albeit with very different high-level
specification languages. Again the assumption of an underlying finite-state architecture was the key factor
in establishing Karttunen's findings.

Given the state of the field outlined above, it seems particularly attractive now to combine these separate
strands of research. More precisely, the desire is to produce a model that can (i) capture theoretical insights
into the analytical requirements of prosodic morphology without (ii) unduly compromising in the area of
proper formalization while (iii) still ensuring effective computability with the help of finite-state methods.
To this end I chose to extend One-Level Phonology, itself a finite-state incarnation of DP, rather than fleshing out
one of the proposals for OT.

The principal reason for this choice is that a one-level model offers much better prospects for
automatic constraint acquisition than its two-level competitors. Corpora usually contain the surface
phonological form of words only and do not come equipped with pairs of surface and abstract
underlying representations (SRs and URs). Theoretical linguistics offers no help either, as e.g. Kenstowicz (1994,
§3.4) argues at length that no a priori restriction on possible URs suffices for all scenarios. The OT notion
of 'lexicon optimization' (Prince & Smolensky 1993), while meant to address the same problem of
deriving suitable URs, is still too vague to merit closer attention. This lack of either a principled or a natural,
non-handcrafted source for two-level pairings means that there may be an arbitrarily large gap to bridge
when trying to infer a finite-state mapping SR ↔ UR from surface-only data. It is thus no accident that e.g.
the results of Ellison (1992) on learning a number of phonological properties in a typologically balanced
sample of 30 languages were obtained by using one-level FSAs for the representation of inferred
generalizations. Other results from the literature on machine learning of natural language seem to confirm this
key advantage of monostratality (e.g. Belz 1998). For OT, on the other hand, no substantial result is known
that addresses the hard problem of constraint acquisition. The prospects of remedying this situation without
giving up elementary OT premises are not particularly good. As noted above, constraints are formally
unrestricted and not required to be surface-true for at least some pieces of data. Existing results on OT-based
learning only deal with the much simpler problem of inferring the ranking of constraints given pairs of
structurally annotated outputs and inputs (Tesar & Smolensky 1993, Boersma 1998, Boersma & Hayes
1999). This is probably also due to the fact that orthodox OT itself expresses disinterest in the question by
assuming that all constraints are already given as part of Universal Grammar.⁵

A second reason for extending OLP is that the lack of arbitrary mapping between levels plus the lack of
global optimization forces a healthy reexamination of existing analytical devices. Maintaining the
restrictive set of assumptions embodied in OLP often leads to the discovery of new surface-true generalizations.
Starting with the one-level approach is more illuminating for investigating the precise nature of the
trade-off between mono- and polystratal analyses. Finally, this point of departure promises better answers to the
question of which set of theory extensions is absolutely necessary in order to cover the enlarged range of
empirical phenomena under study. In what follows we will see that this approach of 'starting small' indeed
yields some of these expected payoffs.

glottal stop distribution, Walther (1996) on truncated plurals in Upper Hessian. 

⁴ Cf. also the verdict of Pierrehumbert & Nair (1996, 537), who write: "Any attempt to argue for a particular method of combining
constraints without simultaneously formalizing the constraints is technically incoherent."

⁵ But see Ellison (to appear) for strong arguments against the universalist interpretation of OT and Hayes (1999) for initial attempts
at phonetically grounded constraint induction.
3 The Problem



In this section, I want to give a brief survey of some of the challenges of prosodic morphology. The focus
will be on what the relevant linguistic data suggest about minimal requirements for formalization and
implementation. The topics to be presented in turn will be reduplication, discontiguity and partial realization
of morphemes as well as cases of 'floating', i.e. positionally underspecified morphemes together with their
directional behaviour.

3.1 Reduplication 

Let us begin with some terminology. The original part of a word from which reduplication copies will be 
called the base, while the copy is also referred to as the reduplicant. The next bits of terminology arise as 
we further classify reduplicative constructions below. The impetus behind this classification is that it rules 
out some easy ways of avoiding the full complexity of the reduplication problem. 

One such classificatory subdivision is between total reduplications and partial ones. The former are
defined to be isomorphic to the formal language ww, known to be context-sensitive, while the latter exhibit
imperfect copying of one sort or another. A frequent case of imperfection has a truncated portion of the
base as the reduplicant. Furthermore, there are unmodified and modified reduplications, where in the latter
case reduplicant and base differ in the applicability of phonological alternations. In contrast to exfixing
reduplications, where the reduplicant is a prefix or suffix, in the infixing variant the reduplicant interrupts the
immediate adjacency relationships of the base. Also, there are discontiguous reduplicants in addition to the
more usual contiguous ones. Figuratively speaking, discontiguity means that some segments of the base are
skipped over in constructing the reduplicant. Whereas some of the most well-known reduplication instances
like Indonesian plural are unbounded in the sense that the reduplicant length is a linear function of the length
of the base, in bounded types of reduplication a finite, and often rather small, upper bound can be placed
on the length of the reduplicant. Finally, there are purely reduplicative constructions versus their fixed
melody-enriched counterparts. The former have reduplicants which are entirely constructed from copied
base material (possibly modified in the above sense), whereas the latter also contain base-independent
segmental material as part of the construction.
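
As a small illustration of the first opposition, the following Python sketch tests whether a string belongs to
the copy language ww. It is an assumption-laden toy rather than an analysis: the function name split_ww is
hypothetical, forms are given in a plain ASCII transcription with ? for the glottal stop, and morpheme hyphens
are removed before testing.

```python
def split_ww(string):
    """Return w if the string has the form ww, otherwise None."""
    half, remainder = divmod(len(string), 2)
    if remainder or string[:half] != string[half:]:
        return None
    return string[:half]


# A total reduplication such as Madurese sakola?an-sakola?an is in ww.
print(split_ww("sakola?ansakola?an"))

# A partial reduplication such as Mokilese podpodok is not.
print(split_ww("podpodok"))
```

The test needs to compare two halves of unbounded length, which is why it is written here as a string
comparison rather than as a finite-state automaton. Partial, modified or fixed-melody reduplications fall
outside ww and therefore need more than this identity check.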

Table (3) shows how constructions from four languages instantiate the classificational scheme outlined
above, illustrating each opposition with at least one construction. Actual examples are supplied in (4) -
(7). Reduplicants are marked with bold face and subscripts mark base-reduplicant correspondences where
necessary.
The first example from Madurese (Malayo-Polynesian) shows a case of unbounded total reduplication
(4): a rather familiar type that needs no further comment here.
(4) Madurese plural (Stevens 1968, 35)



sakola?an-sakola?an 'schools' 
panladhin-panladhin 'servants' 
panokor-panokor(-ra) '(his) razors' 



Mokilese (Micronesian) illustrates the next case (5), namely prefixed reduplicants that consist of a
partial copy of the base. Since the number of segments varies as a function of the base, simple templatic
generalizations seem not to be available here.



(5) Mokilese progressive (Blevins 1996)

    podok         podpodok      'plant/ing'
    nikid         niknikid      'save/ing'
    wia [wija]    wiiwia        'do/ing'
    soorok        soosoorok     'tear/ing'
    onop          onnonop       'prepare/ing'
    andip         andandip      'spit/ing'
    uruur         urruruur      'laugh/ing'



Nisgha (Salish), shown in (6), differs from Mokilese in that there may be phonological modifications 
in the reduplicant that do not affect their correspondence partners in the base. 

(6) Nisgha CVC prefixing reduplication (Shaw 1987)

    a. maikʷs-kʷ    m₁is₂-m₁aikʷs₂-kʷ    'be white'
    b. lilkʷ        l₁uxʷ-l₁ilkʷ         'to lace (shoes)'
    c. qó:?os       q₁as₂-q₁ó:?os₂       'to be cooked'



In (6).b we can see that spirantization has turned the velar stop /kʷ/ into a velar fricative /xʷ/.
Furthermore, the vowel quality of the fixed-size CVC reduplicant is not copied from the base, but constitutes a
fixed melody part instead.⁶ As it is only the first and last segment of the base that is copied, the reduplicant
is discontiguous as well.

Finally, Koasati (Muskogean) has an infixing reduplicative construction depicted in (7), where the
base-initial segment is copied to the interior of the base and followed by a fixed-melody element /o(:)/.



(7) Koasati infixing aspectual reduplication (Kimball 1988)

       base        punctual
    a. tahaspin    t₁ahas-t₁ó:-pin    'to be light in weight'
    b. lapatkin    l₁apat-l₁ó:-kin    'to be narrow'
    c. aklatlin    a₁k-h₁o-latlin     'to be loose'
    d. okcakkon    o₁k-h₁o-cakkon     'to be green or blue'



The facts are further complicated by the need to distinguish between consonantal left edges and vocalic
ones, where in the latter case apparently /h/ - the voiceless equivalent of a vowel - serves as the modified
copy.

⁶ Actually, it may be said to be semi-fixed, since apparently reduplicant vowel quality is determined by the flanking consonants.
The claim to a fixed-melody construction remains valid, however, because the reduplicant vowel is not identical to the corresponding
base vowel.
Two further remarks on properties of reduplication seem appropriate at this point. First, there are cases
like reduplication in Chumash (8) which show that copying must be phonological: it is not in general
sufficient to just repeat a morpheme!

(8) Chumash (Hokan) (Applegate 1976)

    s-RED-ikuk        sik-sikuk        3sg-cont.-chop,hack
    s-is-RED-expec    s-isex-sexpec    3pl-dual-cont.-sing



Note that the copy includes not only some initial portion of the verb stem but also the last consonant of 
the immediately preceding affix, irrespective of its precise morphological function. 
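
The point can be made concrete with a procedural Python sketch that reproduces the two forms in (8). It is not
the analysis developed in this paper, and it rests on assumptions read off those two examples only: the stem is
vowel-initial, the preceding affix ends in a consonant, and the reduplicant spans exactly three segments. The
function names are hypothetical.

```python
def chumash_reduplicate(prefixes, stem, size=3):
    """Copy `size` segments starting at the last segment of the prefix material."""
    material = "".join(prefixes)
    word = material + stem
    copy_start = len(material) - 1  # last consonant of the preceding affix
    reduplicant = word[copy_start:copy_start + size]
    return word[:copy_start] + reduplicant + word[copy_start:]


def morpheme_repeat(prefixes, stem):
    """Naive alternative: repeat the stem morpheme as a whole."""
    return "".join(prefixes) + stem + stem


print(chumash_reduplicate(["s"], "ikuk"))         # compare sik-sikuk in (8)
print(chumash_reduplicate(["s", "is"], "expec"))  # compare s-isex-sexpec in (8)
print(morpheme_repeat(["s"], "ikuk"))             # not the attested form
```

Because the copied span straddles the affix-stem boundary, the procedure has to operate on the concatenated
phonological string. The naive variant, which only sees the stem morpheme, cannot produce either attested form.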

Second, there are reduplicants whose shape cannot be statically determined as a function of the base, 
but which crucially require knowledge of the reduplicant's eventual surface position. A case in point is 
again represented by the Koasati construction from (7), where the reduplicant has a long vowel when 
forced to occupy the stressed (penultimate syllable) position in the word, but exhibits a short-vowelled 
allomorph when landing elsewhere. In fact, because of stress shifts in comparison with base-only words 
(7).a,b, base stress is a poor predictor of reduplicant position. The lesson of Koasati is that independent 
lexical precomputation of the shape of morphemes entering into a reduplicative construction cannot account 
for all cases. 

Let us now step back from the individual cases and ask some obvious questions: What do these
examples really tell us? How can the classification that captures their essential dimensions of variation help
derive necessary and sufficient properties of a linguistically adequate one-level account of reduplication?

First of all, reduplications of the partial, modified, infixing, discontiguous and fixed melody type show
that the model depicted in figure (9).a below is insufficient.



(9) Some problematic reduplication models

a. lexical tape -> finite-state transducer for phonology/morphology/lexicon -> copy device -> surface tape

b. lexical tape -> finite-state transducer for phonology/morphology/lexicon -> copy device -> finite-state transducer for post-copy phonology -> surface tape



The depicted attempt to construct a hybrid model consisting of a strictly finite-state part driving a
separate copy device fails on instances of imperfect copying if the copy device is limited to the simplest
and most efficiently realizable task of mapping input w to output ww. Of course one might devise more
sophisticated variants of such a device (see appendix A for discussion of a concrete proposal). However,
hybrid models of this kind still suffer from two rather obvious drawbacks. First, they constitute heterogeneous
combinations of a finite-state device with a non-finite-state component with unclear overall properties.
Second, they are more or less generation-oriented rather than inherently reversible systems.

The modified type of reduplication shows that sometimes it is not enough to copy object-level segments
from within a word's surface realization, but that a more abstract notion of identity needs to be captured.
While it makes no sense to place the copy device before the lexical transducer, in a multi-level approach
one might envision placing the copy device in the sandwiched position of a 2-transducer cascade (9).b. One
could then copy at the level where object identity still holds and carry out the necessary modifications in the
post-copy part. Unfortunately the aforementioned problems of hybrid models carry over to this setting as
well. With only one level, OLPM will of course have to employ rather different means to model modified
reduplications.

Exfixing types of reduplication allow one to leave base linearizations untouched; no computational 
means for infixation need to be provided. As a further consequence, such constructions in principle can 
support extensive arc sharing through the coexistence of simple and reduplicated realizations in the same
finite-state network. In contrast, the infixing type with its disruption of base linearizations a priori permits
no such memory efficiency and needs formal means to support infixation in the first place.

Discontiguity in general poses the problem of how to model the absence of some stretch of segmental 
material in specific reduplicative constructions when the same material is required to be present in all 
other realizations of bases. Moreover, the greater the length of the skipped-over substring of the base in 
discontiguous reduplicants, the more acute is the problem of handling even bounded-length variants (6) of 
such long-distance dependencies with finite-state automata (Sproat 1992, 91; Beesley 1998). 

The bounded types of reduplications are special in that, given the assumptions of full-form lexica and
sufficient storage space, they could, in principle, be precompiled into a finite-state network. In practice,
however, this can yield prohibitively large networks for realistic fragments of natural language
morphologies (Sproat 1992, 161). Also, it has been remarked that finite-state storage is very inefficient when it comes
to simulating even the fixed amount of global memory needed to remember the copied portion of a base
(Kornai 1996). In contrast, the unbounded type cannot be modelled by precompilation at all. No artificially
imposed upper bound will do justice to the facts in a total-reduplication language like Bambara
(Northwestern Mande) where forms such as wulu-o-wulu 'whichever dog', wulunyinina-o-wulunyinina 'whichever dog
searcher', wulunyininafilela-o-wulunyininafilela 'whoever watches dog searchers', etc. can be arbitrarily
extended, as the careful investigation of Culy (1985) shows. Positing separate models for the two types would
seem to be highly problematic, since bounded and unbounded reduplications may occur in the same
language. For example, Madurese, whose total reduplication appeared in (4), also has bounded final-syllable
reduplication: trɛ-ɛstrɛ 'wives', bu-sanbu 'something increased', wa-buwa 'fruits'.
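
The cost of precompiling bounded copying can be gauged with a back-of-the-envelope Python sketch. It counts
the states of an unminimized, trie-shaped acceptor for all strings ww where w has a fixed length n. The
function name and the two-letter alphabet are assumptions made for the illustration and do not model any
particular language.

```python
from itertools import product


def trie_state_count(alphabet, n):
    """Number of states in a trie-shaped acceptor for {ww : len(w) == n}.

    Each distinct prefix of some accepted string corresponds to one state.
    """
    prefixes = set()
    for w in product(alphabet, repeat=n):
        ww = w + w
        for i in range(len(ww) + 1):
            prefixes.add(ww[:i])
    return len(prefixes)


for n in range(1, 6):
    print(n, trie_state_count("ab", n))
```

Minimization can merge some of these states, but after reading the first half the acceptor must still keep
every possible w apart, so the number of states grows at least as fast as the number of distinct bases of
length n. Raising the bound n enlarges the network accordingly, and no fixed bound covers the Bambara pattern.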

3.2 Discontiguity 

It is a well-known fact that morphology is not always concatenative. Rather, various patterns of morpheme 
"overlap" can be observed, as shown in (10). 

(10) Patterns of Morpheme Overlap

[Figure: morpheme intervals over the segments of German teil + bar, Mod. Hebrew hit + stem (overlap),
and Tagalog basa + um (inclusion).]

In these diagrams, we depict the surface extent of a morpheme as the temporal interval between its
first and last segment.^7 Hence, overlap between two morpheme intervals implies that at least one of the
ordering relationships between intramorphemic segments must be weakened from immediate to transitive
precedence: the hallmark of a discontiguous morpheme.

While German teil-bar 'divisible' is a typical instance of purely concatenative arrangement of morphemes,
the next example from Modern Hebrew illustrates a first case of deviation from the concatenative
ideal, namely morpheme-edge metathesis. Here the conjugation class prefix /hit-/ (hitpael binyan), which
ends in coronal /t/, partially overlaps verbal stems whose first segment is a coronal obstruent. Note that hit-
and coronal-initial verbal roots are only discontiguous if cooccurring (hiʃtana 'changed (intr.)', *hit-ʃana,
but me-ʃane 'change (pres.)' and hit-gala 'appear', *higtala, etc.).

^7 This assumes an understanding of morphemes (and words) as totally ordered sets of segments. Suprasegmental morphemes like
'nasalize word till first obstruent' (cf. the Arawakan language Terena, Bendor-Samuel 1960) need a generalization that refers to
observable phonological effects instead of segments. As far as I know, cases of improper inclusion or total overlap always correspond
to such suprasegmentals, too, and can be exemplified by phenomena like nasalization, pharyngealization, tone marking etc.

In Tagalog infixation, the deviation is more severe, as the actor-trigger morpheme -um- is totally overlapped
by the verb stem in b-um-asa 'read'. In the dual case of circumfixation, also attested in Tagalog
(e.g. ka-an, as in ka-bukir-an 'fields'), affix and stem have simply changed roles in what amounts to the 
same pattern of total overlap. It is worth pointing out that Tagalog still has its share of purely concatenative 
morphology, even involving the same stems (e.g. makd-basa 'read to somebody (by chance)'). 

Even if a language allows discontiguity in regular morphology, it may nevertheless exhibit exceptional 
morphemes that forbid intrusion into their own material. For example, in Ulwa construct-state morphology 
(1).b there are noun stems like kililih 'cicada' where infixation *kili-ka-lih would be predicted, but only
suffixation kililih-ka is possible. It must therefore be possible to control infixability in the Ulwa lexicon. 

With so much focus on discontiguity, we should tackle a potential objection. Is productive discon- 
tiguous morphology perhaps limited to so-called 'exotic' languages? English, thought otherwise to be 
purely concatenative, allows us to give a negative answer to that question. The language has a productive 
process of expletive insertion which readily creates words like Kalama-goddam-zoo, in-fuckin-stantiate,
kanga-bloody-roo, in-fuckin-possible, guaran-friggin-tee (Katamba 1993, 45), thereby breaking up stems
that appear elsewhere as contiguous. 

The challenge of discontiguity then is to come up with a generic formal solution that is ideally able 
to represent morphemes in a uniform manner, regardless of whether discontiguity is prominent, rare or 
nonexistent in a language. It should also capture the fact that immediate precedence of intramorphemic
segments appears to be the default, giving way to transitive precedence only if immediate adjacency leads to 
ungrammaticality. Finally, the solution should allow for cases where discontiguity is either grammatically 
or lexically forbidden. 

3.3 Partiality 

Sometimes morphemes do not realize all their segmental material. We have already seen that in Tonkawa 
and Modern Hebrew, certain stem vowels are omitted by way of regular processes, depending on the affixa- 
tion pattern. However, in these and many similar cases the number of potentially zero-alternating segments 
is strictly predictable. In Tonkawa, at most every stem vowel (and /h/, phonetically a voiceless vowel) can
lose one mora (V, /h/ → ∅; VV → V), whereas in Modern Hebrew there is a maximum of two alternating
stem vowels per verb form. Furthermore, since in these languages omittable vowels are intercalated
with stable consonants, the length of contiguous stretches of deletable material is also bounded by a small 
constant, often 1. Besides this type of bounded partial realization, however, there is a type of potentially 
unbounded partial realization, a case of which we have seen in productive German i-truncation (1).a. Here,
the length of the deleted string suffix of a base noun is a linear function of its original length and therefore 
in principle unbounded (2, 3 and 5 segments in [pc:t*ai], [Tan duciai ], [goKb atJofi ]). While it is true that 
Standard High German does not use truncation for ordinary grammatical processes such as pluralization,^8
other languages like Tohono O'odham, Alabama, Choctaw and Koasati employ truncation for exactly this
purpose (Anderson 1992, 65f). 

Summing up, some natural desiderata for a generic formalization of partiality would be to allow for 
morpheme representations where a priori no segmental position must be realized, to provide for flexible
control over actual realization patterns, and to account easily for the frequent case where no part of a 
morpheme is omitted. 

3.4 Floating Morphemes and Directionality 

The inherent assumption of the continuation-classes approach to morphotactics (Koskenniemi 1983) is that
morphemes are always tied to a fixed position in the usual chain of affixes and stems that make up a word.
However, clear counter-cases of so-called floating morphemes do exist, for example, in Huave (Huavean) 
and Afar (East Cushitic). In Afar (11), the same affix may flip between prefixal and suffixal position, 
depending on the phonology of the stem (Noyer 1993). 

^8 However, some German dialects do have limited truncation for plurals, e.g. Upper Hessian hond 'dog (sg.)' → hon '(pl.)' (Golston
& Wiese 1996).

(11) Second Person Floating Affix in Afar

    a. t-okm-e          b. yab-t-a          c. ab-t-e
       2-eat-perf          speak-2-impf        do-2-perf
       'you ate'           'you speak'         'you did'



Descriptively, the second-person affix "t- occurs as a prefix before a nonlow [stem-initial, M.W.] vowel
and elsewhere as a suffix" (Noyer 1993).

The placement of variable-position infixes has been analyzed in OT by attributing to them an inherent, 
directional 'drift' towards the left or right edge of some constituent such as the word. Analyses using this 
idea typically employ additional affix-independent constraints expressing e.g. prosodic wellformedness to 
control the ultimate deviance from the desired position. OT's development of constraint-based direction- 
ality was originally illustrated with the Tagalog -um- infixation case that we briefly mentioned above. It 
led to the introduction of the EDGEMOST constraint (Prince & Smolensky 1993) and later to a family of 
generalized alignment constraints (McCarthy & Prince 1994, et seq.) that seek to minimize the distance to
some designated edge. 

Adapting the gist of the idea for our present purposes, one could analyze Tagalog -um- by assuming 
leftward drift for the infix while simultaneously imposing a specific prosodic wellformedness condition on 
surface words: syllables must have onsets. One advantage of this analysis is that it explains why lexically 
onsetless affixes cannot become prefixes: *.um.-ba.sa. versus well-formed b-u.m-a.sa.^9

We may proceed similarly for the Afar case. This time the leftward drift of (-)t- needs to be coupled 
with an affix-specific conditional constraint that demands a nonlow vowel to the right if landing in word- 
initial position. In addition, some formal way of ruling out discontiguous stems is needed to prevent infixed 
-t-. Note that if the second person affix cannot become a surface prefix, it indeed lands on the first position 
after the stem, thereby underscoring the leftward-drifting behaviour attributed to t-. 

For motivation of rightward drift, let us look at some data from Nakanai (12), an Austronesian language 
cited in Hoeksema & Janda (1988, 213f). 



(12) Nakanai suffixing VC reduplication

    haro      har-ar-o      'days'
    velo      vel-el-o      'bubbling forth'
    baharu    bahar-ar-u    'windows'
    abi       ab-ab-i       'getting'
    kaiamo    kaiam-am-o    'residents of K. village'



Besides technical devices to model total substring reduplication XY → XiYj-XiYj and a VC
requirement for the reduplicant's segmental content, a minimalist analysis only needs to add rightward-
drifting behaviour for the reduplicant. The reader will find it easy to verify that the forms in the second
column of (12) indeed optimally satisfy both the drift specification and the VC requirement, whereas any
drift further to the right would violate the latter requirement: *haro-ro shows faithful copying and perfect
suffixation, but has a CV reduplicant.^11
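
The interplay of rightward drift and the VC requirement in (12) can be paraphrased procedurally. The following is only a minimal sketch, not the constraint-based analysis itself: it assumes plain orthographic strings, a hypothetical five-vowel inventory, and a helper name `vc_reduplicate` that does not come from the paper. It scans from the right edge for the first vowel-consonant pair and doubles it.

```python
VOWELS = set("aeiou")  # assumption: orthographic vowels stand in for syllable nuclei


def vc_reduplicate(base: str) -> str:
    """Double the rightmost vowel-consonant (VC) pair of `base`.

    Scanning from the right edge mimics rightward drift: the reduplicant
    ends up as far right as the VC requirement allows.
    """
    for i in range(len(base) - 2, -1, -1):
        vowel, consonant = base[i], base[i + 1]
        if vowel in VOWELS and consonant not in VOWELS:
            # Everything up to and including the VC pair, then the VC copy,
            # then whatever is left of the base.
            return f"{base[:i + 2]}-{vowel}{consonant}-{base[i + 2:]}"
    raise ValueError(f"no VC sequence in {base!r}")


for word in ["haro", "velo", "baharu", "abi", "kaiamo"]:
    print(word, "->", vc_reduplicate(word))
```

On the five bases of (12) this reproduces the forms of the second column; bases that end in a consonant are outside the data and are not handled specially.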

^9 This analysis deviates from Prince & Smolensky's original treatment, which admitted onsetless um- before roots like aral 'teach'.
Their analysis is rejected in Boersma (1998, 198). He points out that orthographically vowel-initial roots are actually pronounced with
a leading glottal stop (.?-u.m-a.ral.), while forms like .mag.-?a.ral. 'study', with a proper prefix, show that this glottal stop is better
assumed to be part of lexical representation. To prevent intrusion of -um- into complex-onset roots like gradwet, we may either
assume that infixal /m/ is prespecified to become a syllable onset or make sure that immediate adjacency in complex onset members
is inviolable lexical information: *.g-um.-rad.wet. versus well-formed .gr-u.m-ad.wet. 'graduate'.

^10 Interestingly, the symmetrical case of leftward-drifting VC reduplication with otherwise identical conditions can be found in the
Salish language Lushootseed (Hoeksema & Janda 1988, 214): stiibj > st-ub-ubj 'man', ?ibac > ?-ib-ibac 'grandchildren'.

^11 One objection to the analysis just sketched might be that one could (perhaps generally) eliminate drift at the expense of prosodic
subcategorization. In Nakanai - at least for the data in (12) - this would involve no more than the reduplicant's requirement for
adjacency to a word-final syllable nucleus. We will discuss the nature of the tradeoff involved in choosing between conventional
versus drift-based analyses in more detail in §5.

A seemingly different kind of directionality is involved in choosing grammatical vowel/∅ realization
patterns in the bounded-partiality morphologies of Modern Hebrew, Tonkawa and others. To account for the
kind of left-to-right preference found in these patterns, Walther (1997) has proposed a so-called Incremental
Optimization Principle, "Omit zero-alternating segments as early as possible". The principle explains, inter
alia, why in Tonkawa we-pcen-o? is grammatical (cf. (2)) but *we-picn-o?, *we-picen-o? are not. The latter
two represent a missed chance to leave out /i/, which appears earlier in the speech stream than the second 
stem vowel /e/. Prosodic wellformedness alone does not distinguish between these forms since none of 
them violates the CV(C) syllable constraints. 

To sum up, floating morphemes and the different kinds of directionality pose yet another challenge for
formalization. The question here is how to express drift in lexical representations and whether it is possible 
to unify the seemingly diverse kinds of directionality within a single formal mechanism. 

Having provided linguistic motivation for certain abstract requirements for formalization of prosodic 
morphology, we now turn to our central proposals that meet these requirements in the context of a finite- 
state model. 

4 Extending Finite-State Methods to 
One-Level Prosodic Morphology 

In this section I will present the extensions to OLP that are deemed necessary to formalize the range
of prosodic morphology phenomena described above. A basic knowledge of formal languages, regular
expressions and automata will be helpful in what follows, perhaps at the level of introductory textbooks on
the subject (Hopcroft & Ullman 1979, Partee, ter Meulen & Wall 1990).

4.1 Technical Preliminaries 

Bird & Ellison (1994, §3) proposed state-labelled automata as the formal basis for autosegmental phonology.
In the present work, however, I will return to more conventional arc-labelled automata. The main
reason for this choice is that it eases actual computer implementation, since existing FSA toolkits all rest
on the arc-label assumption (van Noord 1997, Mohri, Pereira & Riley 1998). As Bird & Ellison themselves 
note, the choice has no theoretical consequences because the two automata types are equivalent (Moore- 
Mealy machine equivalence). 

With respect to the actual content of the arcs, however, I will follow OLP by allowing sets as arc labels. 
Eisner (1997a), who employs the same idea for an implementation of OT-type phonological constraints, 
argues that this helps keep constraints small. The reason why we may gain a more compact encoding of 
automata for phonological and morphological purposes is because boolean combinations of finitely-valued 
features can be stored as a set on just one arc, rather than being multiplied out as a disjunctive collection of 
arcs. Again it must be emphasized that the choice is not crucial from a theoretical point of view, but simply 
convenient for actual grammar development.

Of course, sets-as-arc-labels require modifications to the implementation of various standard operations 
on finite-state automata. Our representation for these sets is in the form of bit vectors. For FSA intersection
A ∩ B this implies that the identity test (A.arc_i = B.arc_j) between two arc labels must be replaced
with a refined notion of arc compatibility (A.arc_i ∩ B.arc_j ≠ ∅), which is efficiently implementable with
bitwise logical AND and test-for-nonzero instructions. While FSA concatenation, union and reversal have 
operationalizations that are independent of the nature of arc symbols, forming the complement of a FSA 
involves determinization and completion, two operations which again require modification. Recall that a 
complete automaton is one that has a transition from every state for each element of the automaton alphabet. 
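
The compatibility test for set-labelled arcs can be sketched as follows. This is an illustration under stated assumptions, not the implementation referred to in the text: automata are plain dictionaries, labels are Python integers used as bit vectors, and the bit assignment `A_BIT`, `B_BIT`, `C_BIT` and the function name `intersect` are hypothetical.

```python
# Hypothetical bit assignment for a three-symbol alphabet.
A_BIT, B_BIT, C_BIT = 1, 2, 4


def intersect(fsa1, fsa2):
    """Product construction in which two arcs match if their label sets overlap."""
    start = (fsa1["start"], fsa2["start"])
    arcs, seen, todo = [], {start}, [start]
    while todo:
        p, q = todo.pop()
        for src1, label1, dst1 in fsa1["arcs"]:
            if src1 != p:
                continue
            for src2, label2, dst2 in fsa2["arcs"]:
                if src2 != q:
                    continue
                common = label1 & label2      # bitwise AND = set intersection
                if common:                    # test-for-nonzero = arc compatibility
                    arcs.append(((p, q), common, (dst1, dst2)))
                    if (dst1, dst2) not in seen:
                        seen.add((dst1, dst2))
                        todo.append((dst1, dst2))
    finals = {s for s in seen if s[0] in fsa1["finals"] and s[1] in fsa2["finals"]}
    return {"start": start, "finals": finals, "arcs": arcs}


fsa1 = {"start": 0, "finals": {1}, "arcs": [(0, A_BIT | B_BIT, 1)]}
fsa2 = {"start": 0, "finals": {1}, "arcs": [(0, B_BIT | C_BIT, 1), (0, C_BIT, 0)]}
product = intersect(fsa1, fsa2)
print(product["arcs"])  # the single surviving arc carries the label {b}
```

The result arc is labelled with the intersection of the two argument labels, so one product arc can stand for several symbol-level arcs.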

Completion can be implemented for set-labelled automata by creating a new nonfinal state sink and
adding for every state s ≠ sink an extra arc pointing at sink. This arc is labelled with the universal
alphabet set Σ minus the union of all sets that label the outgoing arcs of state s. Again, bitwise operations
for complement and logical OR can be used here. Whereas for completion a direct realization seems
preferable, one way to realize FSA determinization instead involves a reduction to the conventional,
identity-based version. It capitalizes on the insight that non-empty label intersection gives the same results as the label
identity test iff all label sets bear the property of being either pairwise identical or disjoint. By replacing
existing arcs with disjunctive arcs in such a way that this property is met in the entire automaton, a new 
automaton can be created that is then subject to conventional determinization, i.e. with bit vectors reinter- 
preted as simple integers. Hence, this scheme allows the reuse of existing, optimized software. An optional 
post-determinization step could then fold multiple disjunctive arcs connecting any two states back into a
single arc bearing a disjunctive label. Of course, nothing precludes more efficient schemes which would 
presumably require somewhat more extensive modification of classical determinization algorithms to make 
the implementation efficient for bit vector arc labels. 
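
Both steps of this paragraph can be sketched with integer bit vectors. The code below is a sketch under assumptions: the dictionary layout, the function names `complete`, `atoms` and `split_arcs`, and the self loop on the sink (added here so that the sink itself is complete) are illustrative choices, not taken from the paper.

```python
def complete(fsa, sigma):
    """Add a nonfinal sink and route every symbol a state lacks to it."""
    sink = "sink"
    arcs = list(fsa["arcs"])
    for state in fsa["states"]:
        used = 0
        for src, label, _dst in fsa["arcs"]:
            if src == state:
                used |= label              # logical OR collects the outgoing labels
        missing = sigma & ~used            # complement relative to the alphabet
        if missing:
            arcs.append((state, missing, sink))
    arcs.append((sink, sigma, sink))       # assumption: the sink loops on all of sigma
    return {**fsa, "states": fsa["states"] | {sink}, "arcs": arcs}


def atoms(labels):
    """Refine label sets into blocks that are pairwise disjoint."""
    blocks = []
    for label in labels:
        rest, refined = label, []
        for block in blocks:
            inside = block & rest
            if inside:
                refined.append(inside)
                if block & ~inside:
                    refined.append(block & ~inside)
                rest &= ~inside
            else:
                refined.append(block)
        if rest:
            refined.append(rest)
        blocks = refined
    return blocks


def split_arcs(fsa):
    """Replace each arc by one arc per block contained in its label."""
    blocks = atoms([label for _s, label, _d in fsa["arcs"]])
    arcs = [(s, b, d) for s, label, d in fsa["arcs"] for b in blocks if b & label]
    return {**fsa, "arcs": arcs}


fsa = {"states": {0, 1}, "start": 0, "finals": {1},
       "arcs": [(0, 0b011, 1), (0, 0b110, 0)]}
print(sorted(atoms([0b011, 0b110])))
print(split_arcs(fsa)["arcs"])
```

After `split_arcs`, any two labels in the automaton are identical or disjoint, so the bit vectors can be handed to an identity-based determinizer as opaque integers.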

Finally, FSA minimization is an important operation when it comes to the compact presentation of analysis
results and also when memory efficiency is crucial. However, because reversal and determinization
suffice to implement minimization (Brzozowski 1962), no further modifications are necessary to support 
set-based arc labels. 
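
Minimization by double reversal and determinization can be sketched in a few lines. The sketch assumes ordinary single-symbol labels (set labels would first be made pairwise identical or disjoint as described above) and represents an automaton as a hypothetical triple `(starts, finals, arcs)`; none of these names come from the paper.

```python
def reverse(nfa):
    """Swap start and final states and flip every arc."""
    starts, finals, arcs = nfa
    return (set(finals), set(starts), {(t, a, s) for (s, a, t) in arcs})


def determinize(nfa):
    """Subset construction restricted to reachable state sets."""
    starts, finals, arcs = nfa
    start = frozenset(starts)
    index, todo = {start: 0}, [start]
    d_arcs, d_finals = set(), set()
    while todo:
        current = todo.pop()
        if current & finals:
            d_finals.add(index[current])
        by_symbol = {}
        for s, a, t in arcs:
            if s in current:
                by_symbol.setdefault(a, set()).add(t)
        for a, targets in by_symbol.items():
            nxt = frozenset(targets)
            if nxt not in index:
                index[nxt] = len(index)
                todo.append(nxt)
            d_arcs.add((index[current], a, index[nxt]))
    return ({0}, d_finals, d_arcs)


def minimize(nfa):
    """Brzozowski-style minimization: reverse, determinize, reverse, determinize."""
    return determinize(reverse(determinize(reverse(nfa))))


# A five-state automaton for {ab, cb} with two redundant branches.
nfa = ({0}, {3, 4}, {(0, "a", 1), (0, "c", 2), (1, "b", 3), (2, "b", 4)})
print(minimize(nfa))
```

The example automaton collapses to three states, because the two branches after `a` and `c` accept the same continuation.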

4.2 Enriched Representations 

This section describes in detail some generic enrichments to conventional finite-state representation in the 
OLPM model, motivating in each case why inclusion of the proposed representational element is warranted 
for one-level analyses of prosodic morphology. 

In what follows we will abbreviate set-based arc labels with mnemonic symbols for reasons of readability.
Also, we will liberally apply set-theoretic notions to symbols, speaking e.g. of disjoint symbols when
we really mean that the sets denoted by those symbols must be disjoint. 

As it is actually done in the implementation, we will furthermore conceive of those symbols as types 
that are organized into a type hierarchy, allowing the grammar writer to express both multiple type in- 
heritance and type disjointness. However, the syntactic details will be suppressed in the text; rather, we 
will describe the essential parts of the type signature in prose for the sake of clarity. The semantics of 
the type system assumed here is extremely simple: the denotation of a parent type in the directed acyclic 
graph that constitutes a type inheritance hierarchy is defined as the union of the denotation of its children, 
whereas a type node without children denotes a unique singleton set (cf. Ait-Kaci, Boyer, Lincoln & Nasr 
1989). Complex type formulae are permitted, using the Boolean connectives &, ; and ~ for logical AND
(intersection), OR (union) and NOT (complement), respectively. As mentioned before, all type formulae are
ultimately represented by bit vectors. 
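
The type semantics just described can be made concrete with a small sketch. The hierarchy below, the type names and the function `compile_types` are invented for illustration; only the semantics (leaves denote singletons, parents denote the union of their children, formulae map to bitwise operations) follows the text.

```python
def compile_types(children):
    """Map every type in a DAG (parent -> list of children) to a bit vector."""
    all_types = set(children) | {c for cs in children.values() for c in cs}
    leaves = sorted(t for t in all_types if not children.get(t))
    bits = {leaf: 1 << i for i, leaf in enumerate(leaves)}  # one bit per leaf type

    def denote(t):
        if t not in bits:
            value = 0
            for child in children[t]:
                value |= denote(child)      # parent = union of its children
            bits[t] = value
        return bits[t]

    for t in all_types:
        denote(t)
    universe = (1 << len(leaves)) - 1
    return bits, universe


# Hypothetical signature; 'm' inherits from both 'consonant' and 'sonorant'.
hierarchy = {
    "segment": ["consonant", "vowel"],
    "consonant": ["b", "s", "m"],
    "vowel": ["a", "u"],
    "sonorant": ["m", "a", "u"],
    "technical": ["repeat", "skip"],
}
bits, universe = compile_types(hierarchy)

sonorant_consonant = bits["consonant"] & bits["sonorant"]   # formula: consonant & sonorant
segment_or_skip = bits["segment"] | bits["skip"]            # formula: segment ; skip
not_segment = universe & ~bits["segment"]                   # formula: ~segment
print(sonorant_consonant == bits["m"], not_segment == bits["technical"])
```

Type disjointness is then simply a zero result of the bitwise AND, as for `segment` and `technical` here.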

Let us proceed to the first enrichment, which is a preparatory step for reduplication. It is a defining
characteristic of reduplication that it repeats some part of a string and it would be nice if this property
could somehow be encoded explicitly. Intuitively, in one sense repetition is just about moving backwards
in time during the process of spelling out string symbols. (We momentarily disregard the second aspect
of repetition, that of ensuring proper identity of symbols). Therefore, in an initial attempt to flesh out
this intuition one could add designated 'backjump' arcs to an automaton. Backjump arcs S → P_i, labelled
backjump, would directly connect every state S to all of its predecessor states P_i. A predecessor state P_i is
defined as follows: P_i lies on a path p leading from a start state to S and there exists a non-empty proper
subpath of p beginning at P_i. The example in (13) shows the automaton for a Bambara word malo 'rice'.

(13) Backjumps: an initial attempt at repetition

[Figure: automaton for malo with backjump arcs.]



However, this potential solution has several shortcomings. It is not linear in the length of the string, 
adding n(n + 1)/2 arcs for a string of length n. Given a single backjump symbol - which is desirable for
reasons of uniform 'navigation' within a network -, it introduces too much nondeterminism, because at
string position i, 0 < i ≤ n, there is a choice between i possible backjumps. This nondeterminism in turn
makes it cumbersome to specify a fixed-length backjump, a specification which would not be unusual in 
linguistic applications. Finally, we would require an unbounded memory of already produced states if lazy 
incremental generation of string arcs was desired for an efficient implementation. 
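
The quadratic growth is easy to check on the linear automaton of a single word. The sketch below assumes states numbered 0..n along the string and a hypothetical helper `backjump_arcs`; it simply enumerates one backjump arc from every state to each of its predecessors.

```python
def backjump_arcs(word):
    """All backjump arcs of the linear automaton for `word` (states 0..n)."""
    n = len(word)
    # State j can jump back to every earlier state i < j.
    return [(j, "backjump", i) for j in range(1, n + 1) for i in range(j)]


arcs = backjump_arcs("malo")
print(len(arcs))                             # n(n + 1)/2 for n = 4
print(sum(1 for j, _, _ in arcs if j == 3))  # number of choices at string position 3
```

For malo (n = 4) this gives ten backjump arcs, and position i offers exactly i competing arcs that all carry the same symbol.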

As it turns out, however, we really need only a proper subset of the backjumps anyway, because with 
the Kleene-star operator we already have sufficient means in our finite-state calculus to express iterated
concatenation, i.e. nonlocal arc traversal. A better solution therefore breaks down nonlocal backjumping
into a chain of local single-symbol backjumps. It consists of adding a reverse repeat arc j → i labelled
with a new technical symbol repeat for every pair of states <i, j> connected by at least one content arc
i → j. Content arcs are defined as arcs labelled with segmental and other properly linguistic information.
Furthermore, content symbols must be disjoint from technical symbols. (14) has an example showing the 
Bambara word for rice under the new encoding. 

(14) Repeat arcs: the final version 




To give a preview of how this can be used in reduplication: the regular expression

segment* repeat repeat repeat repeat segment* 

describes an automaton which - when intersected with (14) - will yield a string that contains two instances 
of malo separated by four repeat symbols. The type segment used in the previous expression abbreviates 
the union of all defined segmental content symbols. 
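
The preview can be replayed with a small product construction. This is a sketch under assumptions: labels are Python frozensets rather than bit vectors, the constraint automaton for the regular expression is written out by hand, and the names `string_fsa`, `intersect` and `strings` are hypothetical.

```python
SEGMENTS = frozenset("malo")       # stands in for the type 'segment'
REPEAT = frozenset({"repeat"})


def string_fsa(word):
    """Linear automaton for `word` with one repeat arc per content arc."""
    arcs = []
    for i, seg in enumerate(word):
        arcs.append((i, frozenset({seg}), i + 1))   # content arc i -> i+1
        arcs.append((i + 1, REPEAT, i))             # repeat arc  i+1 -> i
    return {"start": 0, "finals": {len(word)}, "arcs": arcs}


def intersect(a, b):
    """Product construction with overlap-based arc compatibility."""
    start = (a["start"], b["start"])
    arcs, seen, todo = [], {start}, [start]
    while todo:
        p, q = todo.pop()
        for s1, l1, t1 in a["arcs"]:
            for s2, l2, t2 in b["arcs"]:
                if s1 == p and s2 == q and l1 & l2:
                    arcs.append(((p, q), l1 & l2, (t1, t2)))
                    if (t1, t2) not in seen:
                        seen.add((t1, t2))
                        todo.append((t1, t2))
    finals = {s for s in seen if s[0] in a["finals"] and s[1] in b["finals"]}
    return {"start": start, "finals": finals, "arcs": arcs}


def strings(fsa, max_len):
    """All accepted symbol sequences of at most `max_len` symbols."""
    out, todo = set(), [(fsa["start"], ())]
    while todo:
        state, path = todo.pop()
        if state in fsa["finals"]:
            out.add(" ".join(path))
        if len(path) < max_len:
            for src, label, dst in fsa["arcs"]:
                if src == state:
                    for sym in sorted(label):
                        todo.append((dst, path + (sym,)))
    return out


# Hand-built automaton for: segment* repeat repeat repeat repeat segment*
constraint = {"start": 0, "finals": {4},
              "arcs": [(0, SEGMENTS, 0)]
                      + [(i, REPEAT, i + 1) for i in range(4)]
                      + [(4, SEGMENTS, 4)]}

result = strings(intersect(string_fsa("malo"), constraint), max_len=12)
print(result)  # {'m a l o repeat repeat repeat repeat m a l o'}
```

Four repeat symbols can only be traversed starting from the final state of malo, which is why the intersection contains exactly one string.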
Summing up then, the repeat-arc solution is better than the previous one, because the additional amount
of repeat arcs is linear in the length of the string, because there is no nondeterminacy in jumping backwards,
because it is now trivial to add a local repeat arc upon lazy generation of a content arc, and because it has
become easy to jump backwards a fixed amount of k segmental positions by way of the regular expression
repeat^k.

The reader will have noted, however, that repeat arcs change the language recognized by the respective
automaton, in particular by rendering it infinite through the introduction of a cycle within each
arc-connected state pair <i,j>. Also, by referring to states, repeat-arc introduction was defined as
manipulation of a concrete automaton representation rather than with respect to the language denoted by the
automaton.^12 Obviously, the minimization properties of automata containing repeat arcs are rather different
from those of repeat-free automata as well. I will take up these and other issues related to repeat arcs
later, when we have seen how the proposed enrichment interacts with intersection and synchronization 
marks to implement reduplication. 

The next enrichment covers discontiguity, with its characteristic property of transitive precedence be- 
tween content symbols. Again, to be maximally generic we need to allow for intervening material at every 
string position, i.e. before and after every content symbol. Also, intervening material may itself be modified 
by other morphological processes, i.e. it may contain technical symbols such as repeat. We therefore add a 
self loop to every state i, i.e. an arc i → i labelled with Σ, the union of content and technical symbols. The
example this time displays the Tagalog word basa (15). 

(15) Self loops: a representation for discontiguity 




To preview a later use of this representation: when figuring as part of b-um-asa, the self loop pertaining 
to state 1 will absorb the infixal material um. The further issues and consequences that arise from this 
enrichment step are mostly the same as for the previous one, apart from the additional question of what to 
do with unused self loops like those emerging from states 0,2,3,4 in the case of singly-infixed b-um-asa. 
Again, it seems best to treat those issues later on. 
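
A minimal sketch of the self-loop enrichment follows. It assumes frozenset labels, an alphabet `SIGMA` restricted to the symbols needed here, and hypothetical helper names `enrich_with_self_loops` and `accepts`; acceptance is tested by plain nondeterministic simulation.

```python
# Assumption: a toy alphabet of content symbols plus the two technical symbols.
SIGMA = frozenset("basum") | {"repeat", "skip"}


def enrich_with_self_loops(word):
    """Linear automaton for `word` plus a SIGMA-labelled self loop on every state."""
    arcs = [(i, frozenset({seg}), i + 1) for i, seg in enumerate(word)]
    arcs += [(i, SIGMA, i) for i in range(len(word) + 1)]
    return {"start": 0, "finals": {len(word)}, "arcs": arcs}


def accepts(fsa, symbols):
    """Nondeterministic simulation over a sequence of symbols."""
    current = {fsa["start"]}
    for sym in symbols:
        current = {dst for src, label, dst in fsa["arcs"]
                   if src in current and sym in label}
    return bool(current & fsa["finals"])


basa = enrich_with_self_loops("basa")
print(accepts(basa, "bumasa"))   # infixal um absorbed by the self loop at state 1
print(accepts(basa, "basa"))     # no self loop used at all
print(accepts(basa, "bas"))      # content arcs remain obligatory
```

The same representation also accepts `umbasa`, with um absorbed by the self loop at state 0, so the loops by themselves do not fix where intervening material may land.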

The third enrichment deals with partiality and is especially useful for truncation. Here we want to be 
able to exercise fine control over the amount of material that gets realized or skipped over, and also have 
the option of leaving out an a priori unbounded amount of segmental content. Therefore, the leading idea 
again is to use a local encoding, which consists of adding companion skip arcs S → T, labelled skip, to all
content arcs S → T. Like its repeat counterpart, skip is defined to be a new technical symbol. Example
(16) illustrates what the automaton representation of German Petra 'proper (first) name' looks like after the
enrichment.

^12 This way of constructing automata is sometimes discredited, with preference given to exclusively high-level algebraic
characterizations of the underlying languages and relations (Kaplan & Kay 1994, 376). However, in line with van Noord & Gerdemann (1999)
who argue convincingly for a more flexible overall approach, there are good reasons to deviate from Kaplan & Kay's advice in our
case. Concretely, while it is possible to define the appropriate automaton for all strings of length 1 through the regular expression
ContentSymbol (repeat ContentSymbol)*, concatenation of n such expressions for a string s, |s| = n, does not define the
same language as the repeat-enriched automaton corresponding to s. I conjecture that only the automaton-based approach can be
compositional with respect to concatenation.

To give a simple idea of how skip arcs might be used, consider the following regular expression for
picking out the base portion of the hypocoristic form Pet-i:

seg* skip skip 

When intersecting this expression with (16), the string pet skip skip results. A more principled analysis 
of German hypocoristic formation follows in §5.2.

Noting that in this solution technical and content arcs systematically share both source and target states, 
we can actually merge those two arc types and avoid the additional complexity introduced by separate skip 
arcs. Under a set-based labelling regime this works as follows: every content arc will now be labelled with 
the union of the old content symbol and the skip symbol (type formula: ContentSymbol ; skip). Thus
separate skip arcs would only be strictly necessary when for some reason set labels are not available. 
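
The merged encoding can be tried out on the Petra example. The sketch below makes several assumptions: orthographic letters stand in for segments, labels are frozensets, the constraint automaton for seg* skip skip is built by hand, and `skip_enriched`, `intersect` and `strings` are hypothetical helper names.

```python
SKIP = "skip"
SEG = frozenset("petra")           # stands in for the type 'seg'


def skip_enriched(word):
    """Linear automaton whose content arcs carry the label {segment, skip}."""
    arcs = [(i, frozenset({seg, SKIP}), i + 1) for i, seg in enumerate(word)]
    return {"start": 0, "finals": {len(word)}, "arcs": arcs}


def intersect(a, b):
    """Product construction with overlap-based arc compatibility."""
    start = (a["start"], b["start"])
    arcs, seen, todo = [], {start}, [start]
    while todo:
        p, q = todo.pop()
        for s1, l1, t1 in a["arcs"]:
            for s2, l2, t2 in b["arcs"]:
                if s1 == p and s2 == q and l1 & l2:
                    arcs.append(((p, q), l1 & l2, (t1, t2)))
                    if (t1, t2) not in seen:
                        seen.add((t1, t2))
                        todo.append((t1, t2))
    finals = {s for s in seen if s[0] in a["finals"] and s[1] in b["finals"]}
    return {"start": start, "finals": finals, "arcs": arcs}


def strings(fsa, max_len):
    """All accepted symbol sequences of at most `max_len` symbols."""
    out, todo = set(), [(fsa["start"], ())]
    while todo:
        state, path = todo.pop()
        if state in fsa["finals"]:
            out.add(" ".join(path))
        if len(path) < max_len:
            for src, label, dst in fsa["arcs"]:
                if src == state:
                    for sym in sorted(label):
                        todo.append((dst, path + (sym,)))
    return out


# Hand-built automaton for: seg* skip skip
constraint = {"start": 0, "finals": {2},
              "arcs": [(0, SEG, 0),
                       (0, frozenset({SKIP}), 1),
                       (1, frozenset({SKIP}), 2)]}

result = strings(intersect(skip_enriched("petra"), constraint), max_len=5)
print(result)  # {'p e t skip skip'}
```

Each merged arc resolves to either its segment or skip once it meets the constraint, so no separate skip arcs are needed in this encoding.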

4.3 Resource Consciousness 

So far we have ignored a specific problem associated with self loops, the mechanism proposed in (15) 
to handle free infixation. The problem is that the same self loops that were designed to absorb infixal 
content may also absorb accompanying contextual constraints in unexpected ways. It is best to illustrate 
this unwanted interaction by way of an actual example that is already familiar, Tagalog -um- infixation. 
Since the infix lands just behind an obligatory word-initial stretch of syllable onsets, one particularly simple 
way of encoding this prosodic requirement is to attach it to the left side of the infix itself. In doing so we 
assume that segments are tagged with syllable role information by means of finite-state syllabification (not 
shown here).

(17) Automata for Tagalog -um- infixation

[Figure: three automata, (17).a the infix -um- with its onset context, (17).b the self-loop-enriched stem basa,
(17).c the prefix mag-.]

However, upon intersection of (17).a with the self-loop-enriched representation of a stem like basa
(17).b the resulting complex automaton overgenerates. It contains at least one ungrammatical placement
of the affix-plus-left-context in absolute prefixal position: onset u m b a s a versus the correctly infixing
b&onset u m a s a. This is because the affix proper and its prosodic context are both consistent with basa's
initial self loop 0 → 0, causing a kind of vacuous self-fulfilment of the contextual constraint which now
hallucinates onsets that were never provided by the stem itself.

Note that it is not possible to remove the initial self loop: ordinary prefixation using e.g. mag- (17).c
can apply to the same stems that take -um-. Therefore, unless we are willing to give up the overall idea of 
morphemes-as-constraints that are uniformly combined via intersection, we should use intersection in this 
case as well, with the initial self loop now playing host to the prefixal material. The argument becomes 
particularly compelling in the case of conditioned allomorphy. Ellison (1993) has argued that because such 
cases of contextually restricted morphemes exist even in agglutinative languages like Turkish and because
intersection can enforce restrictions and simulate concatenation but not vice versa, it should become the 
preferred method of morpheme combination in a constraint-based setting. 

What then is the cause of our problems in (17)? My diagnosis is that we lack an essential distinction 
between producers and consumers of information. Contextual constraints should only be satisfiable or 
'consumable' when proper lexical material has been provided or 'produced' by independent grammatical 
resources. Now this notion itself is already familiar from other areas of computational linguistics, going 
back at least to LFG's distinction between constraining and constraint equations (Bresnan & Kaplan 1982). 
Since then the general idea has gained some popularity under the heading of resource-conscious logics.^14

^13 Note that -um- is specified as a contiguous morpheme by leaving out morpheme-medial self loops. This reflects the fact that the
infix itself never gets broken up.

^14 See e.g. the 'glue logic' approach to the syntax-semantics interface in LFG (Dalrymple 1999); asymmetric agreement under
coordination (Bayer & Johnson 1995); Johnson (1997)'s development of a more thoroughly resource-based R-LFG version; Dahl,
Tarau & Li (1997)'s Assumption Grammar formalism; Abrusci, Fouqueré & Vauzeilles (1999)'s logical formalization of TAGs.

Here I propose to introduce resource consciousness into automata as well. Suppose we tag symbols
with a separate producer/consumer bit (P/C bit) to formally distinguish the two kinds of information. P/C =
1 defines a producer, P/C = 0 its consumer counterpart. By convention, let us mark producers by bold print
in both regular expressions and automata; see (17) for illustration. Suppose furthermore that we distinguish
two modes of interpretation within our formal system. During open interpretation the intersection of two 
compatible arcs produces a result arc whose P/C bit is the logical OR of its argument bits, whereas in closed 
interpretation mode the result arc receives a P/C bit that is the logical AND of its argument bits. Draw- 
ing the analogy to LFG, open interpretation mode is somewhat akin to unification-based feature constraint 
combination during LFG parsing, while closed interpretation mode would correspond to checking the
satisfiability of constraining equations against minimal models at the end of the parse. In open interpretation,
producers are dominant in intersective combination, so that the only consumer arcs surviving after a chain 
of constraint intersections are those that never combined with at least one producer arc. This is similar to 
the behaviour found in intuitionistic resource-sensitive logics, where a resource can be multiply consumed, 
but must have been produced at least once. 

Open versus closed interpretation imposes a natural two-phase evaluation structure on grammatical
computation. After ordinary intersective constraint combination using open interpretation mode obtains in
phase I, phase II intersects the resulting automaton with Σ* - the universal producer language - in closed
interpretation mode. Because as a result only producer arcs survive, the second step effectively prunes away
all unsatisfied constraints, as desired. Observe that all the real work happens in phase I, which is fully
declarative. Phase II on the other hand is automatic and not under control of the grammar.^16 To indicate
that declarativity has only slightly been sacrificed, we will call the resulting grammatical framework that 
obeys two-phase evaluation and the open/closed interpretation distinction quasi-declarative, and speak of 
Quasi-Declarative Phonology etc. 

It is now easy to see that our introductory problem of getting b-um-asa right has been solved: in the
ill-formed alternative corresponding to word-initial position the consumer-only onset arcs that constitute
-um-'s contextual constraint meet a consumer-only 0 → 0 self loop. Since consumers intersecting with
each other in phase I remain just what they are, they are immediately eliminated when intersecting with
the universal producer language in phase II of grammatical evaluation. In later sections we will encounter
further examples that underscore the utility of a resource-conscious style of grammatical analysis and
demonstrate its wide applicability in prosodic morphology.

4.4 Copying as Intersection 

In this section I will explain a generic method to describe reduplicative copying using finite-state
operations. The germ of the idea already appeared in Bird & Ellison (1992, 48), where the authors noted that the
product of automata, i.e. FSA intersection, is itself a non-regular operation with at least indexed-grammar
power. In illustrating their claim they drew attention to the fact that odd-length strings of indefinite length
like the one described by the regular expression (abcdefg)⁺ can be repeated by intersecting them with
an automaton accepting only strings of even length, yielding (abcdefgabcdefg)⁺ in the example at
hand.
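
The doubling effect in this toy example can be reproduced with an ordinary product construction. The following is a minimal sketch under the assumption that automata are deterministic and encoded as triples (start state, final states, transition dictionary); the helper names `cycle_dfa`, `even_length_dfa`, `intersect` and `accepted_up_to` are invented for the illustration.

```python
def cycle_dfa(word):
    """DFA accepting word+ (one or more repetitions of word)."""
    n = len(word)
    trans = {(i, word[i]): i + 1 for i in range(n)}
    trans[(n, word[0])] = 1  # after a complete copy, start the next one
    return 0, {n}, trans


def even_length_dfa(alphabet):
    """DFA accepting exactly the strings of even length over alphabet."""
    trans = {(parity, sym): 1 - parity for parity in (0, 1) for sym in alphabet}
    return 0, {0}, trans


def intersect(d1, d2):
    """Product automaton: both machines must read the same symbol."""
    (s1, f1, t1), (s2, f2, t2) = d1, d2
    trans = {}
    for (p, a), p_next in t1.items():
        for (q, b), q_next in t2.items():
            if a == b:
                trans[((p, q), a)] = (p_next, q_next)
    finals = {(p, q) for p in f1 for q in f2}
    return (s1, s2), finals, trans


def accepted_up_to(dfa, max_len):
    """List all accepted strings of length <= max_len, shortest first."""
    start, finals, trans = dfa
    accepted, frontier = [], [(start, "")]
    for _ in range(max_len + 1):
        next_frontier = []
        for state, prefix in frontier:
            if state in finals:
                accepted.append(prefix)
            for (src, sym), dst in trans.items():
                if src == state:
                    next_frontier.append((dst, prefix + sym))
        frontier = next_frontier
    return accepted


odd = cycle_dfa("abcdefg")
even = even_length_dfa("abcdefg")
doubled = intersect(odd, even)
```

Enumerating `doubled` up to length 30 yields only the two-fold and four-fold repetitions of abcdefg, i.e. every surviving string consists of whole pairs of copies.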

Since we already know now that with intersection we have a promising operation in our hands to 
implement reduplication, let us work out the details. First, we will show how to get total reduplication in a 
way that makes use of neither the odd-length assumption of Bird & Ellison's toy example nor of a priori 
knowledge about the length of the string as in our preview of a possible application of repeat arcs (14). 
An initial step that improves on that previous account for reduplicating the repeat-encoded malo string 
dispenses with the length knowledge by using the following regular expression: 

m seg* o repeat* m seg* 

¹⁵Because pruning in closed interpretation mode is based on examination of an arc's content, specifically its P/C bit, our proposal
differs from Bird & Ellison (1992)'s purely structural prune(A) operation, which indiscriminately removes all self loops from a
state-labelled automaton A.

¹⁶This is to be understood as a conceptual statement. For practical experimentation we have devised a macro
closed-interpretation whose use is visible in formal grammars. It follows that open interpretation mode is the default setting 
in our implementation. 
Although we have now replaced 4 (= |malo|) consecutive repeats by repeat*, an expression which
encodes jumping back an indefinite amount of time, it is clear that something else is required to generalize
beyond this example. In comparison to words like wulu 'dog', malo is special in that m and o are distinct
symbols that occur only once in the string. Thus they are able to serve two functions at the same time, 
acting as ordinary content symbols and as markers of left and right edges of the reduplicant. 

That observation points to the crucial issue at hand, which is how to identify the edges of a reduplicant
in a generic fashion. In malo it just happened that the edges were already self-identifying. For the general
case we may borrow freely from Bird & Ellison (1994, 68f)'s solution to an analogous problem arising in
autosegmental phonology, that of modelling the synchronizing behaviour of association lines. Just as in
their solution, let us assume a distinct synchronization bit. Here it will be added to all content symbols,
being set to 1 for the edges we want to identify and receiving the value 0 elsewhere. Adopting Bird &
Ellison's notation for combining content and synchronization information, we can draw the automaton for
wulu as follows.¹⁷

(18) Repeat Arcs with Synchronization Bits 




Now the regular expression that describes total reduplication can be liberated from mentioning any
particular segmental content:

seg:1 seg:0* seg:1 repeat* seg:1 seg:0* seg:1

Figuratively speaking we start at a synchronized symbol representing the left edge, move right through a
possibly empty series of unsynchronized segments to another synchronized symbol representing the right
edge, then go back through a series of repeat arcs until we encounter a synchronized symbol again, which
must be the left edge. Note how the same subexpression is used twice to identify the 'original' and 'copied'
occurrences of the reduplicative constituent. With more instances of seg:1 seg:0* seg:1, triplication,
quadruplication etc. would all be feasible using the same approach. Also, it is interesting to see how one bit
actually suffices in this scheme to identify two kinds of edges in all strings of length > 1, exploiting the
fact that concatenation is associative but not commutative.
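
To make the interplay of repeat arcs and synchronization bits concrete, here is a simplified sketch. It assumes a linear lexical automaton with one content arc and one backward repeat arc per segment, sets the synchronization bit to 1 on the first and last segment only, and ignores producer/consumer bits, skip arcs and self loops altogether. The encoding and the names `lexical_automaton`, `PATTERN`, `reduplicate` and `surface` are assumptions made for the illustration.

```python
def lexical_automaton(word):
    """Arcs (source, label, destination) for a repeat-encoded word."""
    arcs, last = [], len(word) - 1
    for i, seg in enumerate(word):
        sync = 1 if i in (0, last) else 0
        arcs.append((i, ("seg", seg, sync), i + 1))  # content arc, moves right
        arcs.append((i + 1, ("repeat",), i))         # repeat arc, moves left
    return arcs, 0, len(word)


# DFA for: seg:1 seg:0* seg:1 repeat* seg:1 seg:0* seg:1
PATTERN = {
    (0, ("seg", 1)): 1,
    (1, ("seg", 0)): 1,
    (1, ("seg", 1)): 2,
    (2, "repeat"): 2,
    (2, ("seg", 1)): 3,
    (3, ("seg", 0)): 3,
    (3, ("seg", 1)): 4,
}
PATTERN_FINAL = 4


def abstract(label):
    """Forget segmental content; keep only arc kind and synchronization bit."""
    return "repeat" if label[0] == "repeat" else ("seg", label[2])


def reduplicate(word):
    """All complete paths through the intersection of word and pattern."""
    arcs, start, final = lexical_automaton(word)
    results, stack = [], [(start, 0, [])]
    while stack:
        q, p, path = stack.pop()
        if q == final and p == PATTERN_FINAL:
            results.append(path)
        for src, label, dst in arcs:
            if src == q:
                p_next = PATTERN.get((p, abstract(label)))
                if p_next is not None:
                    stack.append((dst, p_next, path + [label]))
    return results


def surface(path):
    """Spell out the content symbols along a path."""
    return "".join(label[1] for label in path if label[0] == "seg")


paths = reduplicate("wulu")
```

In this simplified model the intersection contains exactly one path for wulu, which spells out wuluwulu and traverses four repeat arcs; a one-segment word yields no path, in line with the restriction to strings of length > 1.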

To handle Bambara's actual Noun-o-Noun reduplication, we need to combine the enrichments of sections
4.2 and 4.3, in particular using one of the self loops to provide space for the intervening /o/. Here then is
the full encoding of both wulu 'dog' and the reduplicative construction itself: 

¹⁷Ellison (1993) contains an earlier application of synchronization symbols to the problem of translating concatenation into
intersection. Ellison's comment that only a finite alphabet ({0,1}) is needed in the translation carries over into the present setting:
synchronisation symbols need not be multiplied for triplication, quadruplication, etc.

¹⁸In our typed setting, we actually use a new type synced, writing ContentSymbol & synced for ContentSymbol:1 and
ContentSymbol & ¬synced for ContentSymbol:0. Type declarations ensure that each instance of ContentSymbol is itself
underspecified with respect to synchronization.
(19) wulu and reduplication automaton




After intersection (19).a ∩ (19).b and pruning away of consumer-only arcs we get an automaton which
is equivalent to

w:1 u:0 l:0 u:1 o repeat'^ repeat* w:1 u:0 l:0 u:1

There was nothing special in intersecting with a singleton lexical set, hence it is trivially possible to ex- 
tend the Bambara lexicon beyond the single example we have shown. All that is needed is to represent new 
lexemes as FSAs in the same vein as wulu, and define the lexicon as the union of those FSA representations. 

Also, it is easy to see that we can construct a variety of other reduplication automata that use different
synchronization and repeat patterns or employ additional skip arcs. These would encode the various types of
partial reduplications that we saw in section 3.1. I will indeed present one such case in section 5, but leave
the others as an exercise to the reader for reasons of space.

4.5 Bounded Local Optimization 

This section defines a final operation over enriched automata called Bounded Local Optimization (BLO), 
which can be understood as a restricted, non-global search for least-cost arcs in weighted automata. The 
operation will be shown to permit implementation of the Incremental Optimization Principle (IOP, p.11),
while later sections illustrate that it can also be used for morpheme drift and longest-match behaviour. 

To prepare the ground for such an operation it is best to return to the simple Tonkawa example
we-pcen-o? (cf. (2)) that was discussed previously in connection with the IOP. Given a stem representation
p(i)c(e)n(a) that contains three zero-alternating vowels, the addition of two nonalternating affixes we- and
-o? does not enlarge the resulting set of eight (2³) word forms. Intersection of this set with some simplified
prosodic constraints *CCC and *VV that ban sequences of at least three consecutive consonants and two
adjacent vowels still leaves us with three remaining forms. Here the IOP steps in, preferring we-pcen-o?
over *we-picen-o?, *we-picn-o? because only the first form lacks the i that constitutes the earliest omittable
vowel.
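
The arithmetic of this example is easy to check mechanically. The sketch below enumerates the eight candidate forms, filters them with string-level stand-ins for *CCC and *VV, and then picks the form that omits alternating vowels as early as possible by a simple lexicographic comparison. That last step is only a global stand-in for the IOP, not the local mechanism developed in this section; the ASCII `?` represents the final consonant of the suffix, and all function names are invented for the illustration.

```python
from itertools import product

# Stem p(i)c(e)n(a): True marks a zero-alternating vowel.
STEM = [("p", False), ("i", True), ("c", False),
        ("e", True), ("n", False), ("a", True)]
PREFIX, SUFFIX = "we", "o?"
VOWELS = "aeiou"


def candidates():
    """All forms obtained by realizing or omitting each alternating vowel."""
    optional = [i for i, (_, alternating) in enumerate(STEM) if alternating]
    forms = []
    for choice in product([False, True], repeat=len(optional)):
        realized = dict(zip(optional, choice))  # True = vowel is realized
        stem = "".join(seg for i, (seg, alternating) in enumerate(STEM)
                       if not alternating or realized[i])
        forms.append((PREFIX + stem + SUFFIX, choice))
    return forms


def wellformed(form):
    """Simplified *CCC and *VV: no three consonants, no two vowels in a row."""
    skeleton = "".join("V" if ch in VOWELS else "C" for ch in form)
    return "CCC" not in skeleton and "VV" not in skeleton


survivors = [(form, choice) for form, choice in candidates() if wellformed(form)]
# Earliest omission wins: compare realization choices from left to right.
preferred = min(survivors, key=lambda pair: pair[1])[0]
```

With these definitions `survivors` contains the three forms wepceno?, wepiceno? and wepicno?, and `preferred` is wepceno?.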

To implement this kind of behaviour, the first idea is to extend the FSA model once again, namely towards
the inclusion of local weights on arcs. The weights are taken from a totally ordered set and represent
the costliness of a particular choice. In our example, the realization of an alternating vowel should be more
costly than its omission, e.g. by receiving a greater weight. Now this move in itself fits in with the recent
gain in popularity that weighted automata and transducers have enjoyed, finding application in areas such
as speech recognition, speech synthesis, optimality theory and others (Pereira & Riley 1996, Sproat 1996,
Ellison 1994b). Usually, however, the theoretical assumption has been that the minimal weighted unit is the
string, not the individual symbol. Taking advantage of this assumption, Mohri (1997) is able to both move
and modify individual weights between arcs in an operation called 'pushing' in order to prepare a weighted
automaton for minimization. In our application, though, weights represent localized linguistic information
which should not be altered. Therefore, I will at present pursue the alternative of weighting the symbols
themselves. Also, for purposes of this paper it will actually suffice to introduce a small finite number of
different weights. Hence the weights used here can equally well be formalized as part of the type hierarchy
that structures label sets, and that is indeed what the current implementation does. While it would
definitely be worth trying to take the other route as well and make the transition to the general case of
unrestricted, possibly numeric, weighting schemes that build upon the weighted-string assumption, the choice
seems premature right now. Rather, it seems best to wait until the analysis of enough decisive phenomena
has been carried out in the present framework and then evaluate what the ultimate consequences of each
assumption are.

Here then is one automaton representation of the {we-pcen-o?, we-picen-o?, we-picn-o?} set, enriched 
with only two weights representing the marked realization case (depicted as /1) and the unmarked elsewhere
case (depicted as /0). The ordering that will be assumed is marked > unmarked.

(20) Weighted automaton for Tonkawa example 




Note that in particular affix vowels are correctly represented as unmarked, because they are nonalternating,
just like all members of the consonantal 'root' p-c-n.

Now, how does one use the weights to implement the IOP? The crucial observation is that the IOP's pruning
of costlier alternatives translates into local inspection of the arcs emanating from states like 3 and 10 in
(20). In each of the choices 3 → 4 or 9, 10 → 5 or 6, there is an alternative between a marked arc and
an unmarked one. Let us call such an arc A ∈ State.arcs a choice arc whenever |State.arcs| > 1.
By cutting away marked, i.e. non-optimal choice arcs, we arrive at an automaton which recognizes the
single string wepceno?, as desired. Since it is desirable to abstract away from the specifics of this example,
we will in the following develop a new operation called Bounded Local Optimization (BLO) which will
encapsulate the locally-determined pruning of non-optimal arcs. To make it widely applicable, we introduce
two generalizations into our example procedure.

The first generalization is to prune only those arcs from the set S.arcs whose weight is greater than the
minimum weight over this entire set. For example, S.arcs = {$3 \xrightarrow{c/0} 4$, $3 \xrightarrow{i/1} 9$} has the associated minimum
weight 0. As a consequence, preservation of non-choice arcs like $4 \xrightarrow{e/1} 5$ is automatic, since the minimum
over singleton weight sets is independent of the only element's weight value. Also, the generalization
means that even multiple choice arcs will survive pruning iff they are all weighted with the minimum cost,
thus providing a way to maintain alternatives beyond the pruning step, e.g. to implement free variation.²⁰

The second generalization is to parametrize the operation under development with a fixed look-ahead
of k arcs, summing up weights over each individual k-length path extending from a given state. In our
running example, k = 1 was sufficient to detect gradient wellformedness differences in a maximally local
fashion. In general, though, one might need to examine a greater number of consecutive arcs to discover
the (non-)optimality of an alternative path. For example, if the arc labels of 3 → 9, 9 → 10 switched places

¹⁹Interestingly, Kiraz (1999) reports that actual grammars for Bell Labs text-to-speech applications also obey this restriction, e.g.
the German module has 33 weights and the French module 12.

²⁰One slight modification that is not pursued here (but will be assumed in §5.3) would add a mechanism to make designated
arcs inert to minimum-based pruning. The rationale behind this move is that there are scenarios where e.g. technical arcs would be
compared to content arc alternatives of various weight in a nonsensical way. A simple way to signal inertness is by negative weights:
an arc is pruned whenever the length-k alternatives yield a positive lower summed weight. With the help of a special value −∞ we
can even prevent arc pruning independent of k and the actual weight distribution in the alternatives.
and arc 10 → 6 was eliminated, we would need k = 3 to see that 3 → 9 initiates a costlier path, hence
should be optimized away ($\sum_{3 \to 9 \to 10 \to 5} = 2$, $\sum_{3 \to 4 \to 5 \to 6} = 1$).
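
The dependence on k in this hypothetical variant can be traced with a few lines of code. The fragment below encodes only states 3 to 6 of the modified automaton; the labels and weights of the arcs other than 3 → 4, 3 → 9 and 4 → 5 are inferred from the prose and should be read as assumptions, as should the helper names.

```python
# state -> list of (label, weight, destination); labels of 3->9 and 9->10
# are swapped and arc 10->6 is removed, as in the thought experiment above.
MODIFIED = {
    3: [("c", 0, 4), ("c", 0, 9)],
    4: [("e", 1, 5)],
    9: [("i", 1, 10)],
    10: [("e", 1, 5)],
    5: [("n", 0, 6)],
}


def min_lookahead(arcs, state, k):
    """Minimal summed weight over paths of at most k arcs leaving state."""
    if k == 0 or not arcs.get(state):
        return 0
    return min(weight + min_lookahead(arcs, dst, k - 1)
               for _, weight, dst in arcs[state])


def first_arc_costs(arcs, state, k):
    """Cheapest length-k continuation for every arc leaving state."""
    return {(state, dst): weight + min_lookahead(arcs, dst, k - 1)
            for _, weight, dst in arcs[state]}
```

For k = 1 and k = 2 both arcs leaving state 3 are tied (0 and 0, then 1 and 1); only with k = 3 does the path through state 9 come out costlier (2 against 1).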

To sum up, calling BLO a kind of optimization is justified because non-minimally-weighted choice arcs
are pruned; the optimization is boundedly local because no path of length greater than k influences the decision
of which arcs to prune. In contrast, the (N-best) shortest-path algorithms (Dijkstra 1959, Tarjan 1983)
used in most other applications of weighted automata constitute global optimization procedures which
will use information from the entire automaton to determine their result. Although the latter optimization
procedures could presumably be used as part of the aforementioned general exploration of the weighted-string
alternative, pursuing the more restricted local variant is preferred here, because it incorporates an
interesting hypothesis about what formal power is actually needed in processing prosodic morphology.²¹
Another consequence of adopting BLO is that it makes the question of what maximal look-ahead is (perhaps
universally) required a topic of promising empirical research, which could shed further light onto the
resource-conscious structure of natural language patterns.

Let us now proceed from informal sketches to a more precise definition of BLO. First, I will give a 
different characterization of BLO as a function that maps between weighted formal languages, in order to 
ensure its representation-independent definability. Such a characterization is desirable, since for any fixed 
k and suitable locally-weighted language L one can construct 'illbehaved' automaton representations, e.g. 
by inserting epsilon transitions, so that non-optimal paths cannot be detected within a window of length
k. Therefore, in a second step I will briefly consider the automaton-theoretic implementation of the BLO
again, focussing in particular on the question of which automaton representation acts as wellbehaved input 
to it. 

In the language-theoretic characterization of BLO, then, the idea is to sum over (the local weights
of) k-length substrings, the equivalent of k-length arc paths. We discard a string w when there is at least
one position whose associated k-length sum exceeds the minimum sum for that position obtained through
evaluation of other comparable strings, i.e. those which share a common prefix. For the case where only a
substring of length j < k exists, we simply define that its weight sum is the sum of the existing weights
up to position j, or equivalently that non-existing positions have implicit weight 0. Here then is the formal
version of BLO:

(21) Given an alphabet Σ as a finite, nonempty set of symbols and a set of non-negative, real-valued weights
ℝ₊, a locally weighted language L is defined as follows: L ⊆ (Σ × ℝ₊)*. We will also speak of a
locally weighted string w whenever w ∈ L for some locally weighted language L. Finally, let w[i]
pick out the i-th pair in w for 0 ≤ i < |w|, let w[0…i] denote the length-i prefix of w and let π₂
be the projection of the second element of a pair.

Then Bounded Local Optimization BLO : L × ℕ \ {0} → L is defined as

\[
\mathrm{BLO}(L, k) = \{\, w \in L \mid \nexists\, pos \in \mathbb{N},\ v \in L :\ 0 \le pos + k \le |w|,\
v[0 \ldots pos] = w[0 \ldots pos],\
\mathit{weight\_sum}(pos, k, v) < \mathit{weight\_sum}(pos, k, w) \,\}
\]

and, with k, pos ∈ ℕ and w a locally weighted string,

\[
\mathit{weight\_sum}(pos, k, w) =
\begin{cases}
0, & k = 1 \wedge pos \ge |w| \\
\pi_2(w[pos]), & k = 1 \wedge pos < |w| \\
\mathit{weight\_sum}(pos, 1, w) + \mathit{weight\_sum}(pos + 1, k - 1, w), & k > 1
\end{cases}
\]

²¹Interestingly, Mohri, Riley & Sproat (1996, 96ff) also explore (different) incomplete optimization methods that visit only a subset
of the states. However, their motivation is to solve a practical problem in speech recognition, namely that the enormous number of
states prohibits plain application of single-source shortest-path algorithms on current hardware.
To illustrate the operation just defined, suppose
L = {w₁, w₂}, with w₁ = ⟨a,0⟩⟨b,1⟩, w₂ = ⟨c,1⟩⟨d,0⟩. Then BLO(L, 1) = {w₁}, because
w₁[0…0] = w₂[0…0] = ε and
weight_sum(0, 1, w₂) = π₂(⟨c,1⟩) = 1 > 0 = π₂(⟨a,0⟩) = weight_sum(0, 1, w₁),
together with the fact that examination of position 1 leaves the optimality of w₁ unchallenged,
since no common prefix exists: w₁[0…1] = ⟨a,0⟩ ≠ w₂[0…1] = ⟨c,1⟩.

Thus, the example illustrates that BLO is in fact a 'greedy', directional type of optimization: weights are
evaluated by position such that a string cannot compensate for an initial costly string portion by some
cheaper suffix if an alternative with cheaper prefix exists. This is a welcome result since one application of
BLO is to model the IOP, whose original formulation "Omit zero-alternating segments as early as possible"
was intentionally defined in directional terms. Although BLO's directionality is perhaps not immediately
apparent from the declarative definition, it does in fact follow from the local examination of weights
embodied in weight_sum together with the common prefix requirement. Note also that BLO(L, k) = L for
k > 1, because by weight_sum(0, k, w₁/₂) = 1 both strings are kept. Hence, we see that optimization
results are not necessarily monotonic with respect to parameter k. In particular, care must be taken to avoid
look-ahead windows that are too big, because - as we have just seen - overstretching the bounds of locality
can sometimes blur distinctions expressed by the weights.
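
The language-theoretic definition can be transcribed almost literally for finite languages. The sketch below assumes that a locally weighted string is a tuple of (symbol, weight) pairs and that a language is a finite Python set of such tuples; `weight_sum` is written iteratively, with missing positions contributing weight 0, and the names are again illustrative only.

```python
def weight_sum(pos, k, w):
    """Sum of the local weights of w[pos], ..., w[pos + k - 1];
    positions beyond the end of w implicitly count as 0."""
    return sum(weight for _, weight in w[pos:pos + k])


def blo(language, k):
    """Keep w unless some v with the same length-pos prefix has a
    strictly smaller k-window sum at a position pos with pos + k <= |w|."""
    kept = set()
    for w in language:
        beaten = any(
            v[:pos] == w[:pos] and weight_sum(pos, k, v) < weight_sum(pos, k, w)
            for pos in range(0, len(w) - k + 1)
            for v in language
        )
        if not beaten:
            kept.add(w)
    return kept


w1 = (("a", 0), ("b", 1))
w2 = (("c", 1), ("d", 0))
L = {w1, w2}
```

Calling `blo(L, 1)` returns only `w1`, while `blo(L, 2)` returns both strings, reproducing the non-monotonic behaviour noted above.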

There is a last peculiarity worth noting, which has to do with the entity over which BLO, or any other
grammar-defining optimization approach, for that matter, should be applied. Suppose that in our illustrative
example L the strings w₁ and w₂ actually represented different lexical items rather than realizational
alternatives of a single item. The result BLO(L, 1) = {w₁} then means that with w₂ unfortunately a lexical
item itself has been pruned, or in other words, that optimization cannot be safely applied over the entire
lexicon. Rather, BLO must be applied on a per-item basis, at least conceptually. There are various ways
to actually implement this requirement. One way would be to prefix each lexical item with string-encoded
semantic and morphological information that is weighted with the same item-independent weight. The
prefixes make item beginnings unique, so they will be preserved even in a minimized FSA version of the
lexicon, and the uniform weights ensure that no item will be prematurely discarded, thus banning harmful
interaction. However, one drawback of such a scheme would be that FSA minimization would have
fewer chances of reducing the size of automata, as compared to the usual encoding of grammatical information
at the end of phonological strings (Karttunen, Kaplan & Zaenen 1992). The latter encoding could
be preserved in a second scheme where during generation one first intersected the lexicon automaton with
Σ* ⟨semantic/morphological features of desired word form⟩ and then applied BLO to the result. Finally,
one might devise an algorithm to predict which arcs potentially participate in harmful interaction and prefix
only these with disambiguating information, in what might be seen as an attempt to use the first method on
a demand-driven basis for improved minimization behaviour.

Returning to the question of what machine representation α of a locally weighted language L is sound
input to an automaton-based implementation of BLO, I conjecture that a sufficient condition for soundness
is that α takes the form of the minimal deterministic automaton for L having a single start state. To see
the plausibility of this conjecture recall that while a minimal automaton is defined to have the minimal
number of states, it is also minimal in number of transitions (Mohri 1997, Corollary 1). Being minimal and
deterministic, string prefixes are shared wherever possible and there are neither epsilon transitions nor is
there useless nondeterminism. Thus the remaining choice arcs must encode non-reducible local choices.
While the BLO algorithm considers each of the choice arcs emanating from a given state for pruning, it
suffices to examine the summed weights of length-k paths starting with those arcs because the condition
of common prefixes in definition (21) is already guaranteed through sharing. Because α has a single start
state, it follows that the condition is also met for the case of the empty prefix.

Given this clarification about the nature of BLO input, we are now in a position to present the algorithm 
in pseudocode for the automaton-based implementation in (22). 
(22) BLO Algorithm

BoundedLocalOptimization(α, k)
 1  β.trans ← visited ← ∅
 2  states ← β.start ← α.start
 3  while states ≠ ∅ do
 4      q ← DEQUEUE(states)
 5      visited ← visited ∪ {q}
 6      if |q.arcs| > 0
 7          then (nextstates, nextarcs) ← NextArcsOnMinimalPaths(q, k)
 8               ENQUEUE(states, nextstates − visited)
 9               β.trans ← β.trans ∪ nextarcs
10      if q ∈ α.final
11          then β.final ← β.final ∪ {q}
12  return β



The algorithm takes a locally weighted automaton α and the look-ahead constant k as input. Line 1
initializes the set of transitions of the result automaton β and the set of already visited states to empty,
while line 2 copies the input start states to both the result start states and the set of unprocessed states.
While there are states to process (line 3), we remove a current state q from this set and mark it as visited
(lines 4-5). If that state has outgoing arcs (line 6), we examine all paths of maximum length k that originate
at q and return those nextstates ⊆ {dest | q → dest} and nextarcs ⊆ q.arcs that were found to lie
on one of the paths with minimal weight sum (line 7).²² We then add the new, i.e. non-visited 'minimal'
states found in this way to the set of unprocessed states (line 8). Note that this step is responsible for the
local, incomplete exploration of the state set of α: 'non-minimal' states will not be considered in further
iterations (unless, of course, other arcs happen to refer back to them). In the next step we add nextarcs
- the pruned subset of q's outgoing arcs - to the transition set of the result automaton (line 9). Finally,
if the current state was final in the input, then so must it be in the result (lines 10-11). After the loop over all
'minimal' states has been completed, we return the finished result automaton β in line 12.
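
For readers who prefer executable code, here is a compact sketch of the procedure in (22). It is an illustration under stated assumptions, not the implementation used for the case studies: automata are triples (arcs, start, finals) with a single start state and `arcs` mapping a state to a list of (label, weight, destination) entries, arcs are conventional one-arc-per-disjunct arcs, and the test automaton `TONKAWA` is a reconstruction of (20) whose state numbers beyond those mentioned in the prose are guesses. The extra `if q in visited` check merely guards against a state being enqueued twice before it is processed.

```python
from collections import deque

# Assumed reconstruction of automaton (20).
TONKAWA_ARCS = {
    0: [("w", 0, 1)], 1: [("e", 0, 2)], 2: [("p", 0, 3)],
    3: [("c", 0, 4), ("i", 1, 9)],
    4: [("e", 1, 5)], 9: [("c", 0, 10)],
    10: [("e", 1, 5), ("n", 0, 6)],
    5: [("n", 0, 6)], 6: [("o", 0, 7)], 7: [("?", 0, 8)],
}
TONKAWA = (TONKAWA_ARCS, 0, {8})


def min_path_cost(arcs, state, k):
    """Minimal summed weight over paths of at most k arcs leaving state."""
    if k == 0 or not arcs.get(state):
        return 0
    return min(w + min_path_cost(arcs, dst, k - 1) for _, w, dst in arcs[state])


def next_arcs_on_minimal_paths(arcs, q, k):
    """Arcs leaving q that start a minimally weighted path of length <= k."""
    cost = {arc: arc[1] + min_path_cost(arcs, arc[2], k - 1) for arc in arcs[q]}
    best = min(cost.values())
    nextarcs = {(q,) + arc for arc, c in cost.items() if c == best}
    nextstates = {arc[3] for arc in nextarcs}
    return nextstates, nextarcs


def bounded_local_optimization(alpha, k):
    arcs, start, finals = alpha
    trans, visited, beta_final = set(), set(), set()
    states = deque([start])
    while states:
        q = states.popleft()
        if q in visited:          # state was enqueued more than once
            continue
        visited.add(q)
        if arcs.get(q):
            nextstates, nextarcs = next_arcs_on_minimal_paths(arcs, q, k)
            states.extend(nextstates - visited)
            trans |= nextarcs
        if q in finals:
            beta_final.add(q)
    return trans, start, beta_final


def strings(automaton):
    """Spell out all strings of an acyclic result automaton, sorted."""
    trans, start, finals = automaton
    out, stack = [], [(start, "")]
    while stack:
        q, prefix = stack.pop()
        if q in finals:
            out.append(prefix)
        for src, label, _, dst in trans:
            if src == q:
                stack.append((dst, prefix + label))
    return sorted(out)
```

On the assumed automaton, `strings(bounded_local_optimization(TONKAWA, 1))` leaves the single form wepceno?; states 9 and 10 are never visited, which is the incomplete exploration described above. With k = 3 the two arcs leaving state 3 tie in this sketch and wepicno? survives as well, a small instance of the caution against overly large look-ahead windows.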

It is easy to see that the algorithm in (22) must always terminate. The set of states that controls the
only existing loop is initialized to a finite set of start states at the beginning. While line 8 from the loop
body increases states by some amount which is bounded by $\max_{q \in \alpha.states} |q.arcs|$, it simultaneously ensures
that no state will be added more than once due to the subtraction of visited states (visited itself grows
monotonically, line 5). Because by definition |α.states| < ∞, the total increase must be finite as well.
Since each iteration unconditionally removes one element from states, the nonemptiness condition in line
3 will evaluate to false after a finite number of iterations, as required for the proof of termination.

Similar ideas have been explored under the heading of locality in violable constraint evaluation by Tesar
(1995) and in particular Trommer (1998, 1999). Trommer (1998, p. 30, fn. 12) explicitly acknowledges the
intellectual debt to the Incremental Optimization Principle of Walther (1997), which is also the precursor
of BLO; both of Trommer's papers apply the local evaluation concept in an interesting way to Mende tone
data. However, while the clearest exposition of his local optimization algorithm is in Trommer (1999),
there are a number of differences and problems. Trommer's Optimize(T) is defined as an algorithm over
transducers only; there is neither a characterization in terms of regular relations comparable to our
language-theoretic definition of BLO nor a discussion of the dependency on a suitable normal form for automata.

²²Note a slight complication that arises when type-based disjunctive arc labels are allowed: sometimes even a single arc
from i to j whose disjunctive label carries the type-encoded weights 0 and 1 must not be pruned altogether but rather have
non-minimal disjuncts removed, leaving an arc from i to j with the minimal disjunct only. The necessary generalization of
NextArcsOnMinimalPaths(q, k) is easy: one simply collects the set of (summed)
weights W for any given arc (path) using n type subsumption checks that test containment of each of the n weight types, and then

intersects the arcs q.arcs with > 1 to effect pruning. However, for expository purposes we will stick with the conventional 

one-arc-per-disjunct assumption, at least in connection with BLO. 
His algorithm definition is both somewhat erroneous (lines 8,9) and not formulated as an incomplete search
that directly exploits the computational advantage of locality. Finally, the generalization to a look-ahead
k > 1 is missing, and Optimize(T) is used at each step of a constraint cascade, in contrast to the restricted
use of BLO as a one-step filter on the final result of autonomous automata intersections.

4.6 Flat representation of prosodic constituency 

So far we have avoided taking any stand on the issue of which set of prosodic categories to assume, how to
conceive of the relationships between such categories and how to represent these in a finite-state framework.
For concreteness, we will now briefly consider the topic in a bit more detail. However, perhaps somewhat
surprisingly, prosodic constituency above the level of sonority will not be strictly necessary in any of the
three case studies under §5. Thus, the reader may skip this subsection on a first reading, resting assured
of the principal result developed below, namely that the present framework freely allows for conventional
prosodic constituency, albeit in a new representational format, whenever the empirical facts or different
styles of analysis seem to warrant its inclusion.

Since at least Selkirk (1980) many authors have assumed that phonology above the segmental level
is organised in a fashion much like syntax, employing hierarchical structure to represent prosodic
constituency. In Selkirk's work, for example, the categories of syllable σ, foot Σ and prosodic word ω
are proposed, together with subscripted s(trong)/w(eak) modifications to mark up subcategories and
superscripted primes to tag supercategories. Hence, a word like English sensational receives the following
prosodic representation (23).

(23) English sensational according to Selkirk (1980, 601)

[Tree diagram not reproduced; syllables at the terminal level: sen, sa, tio, nal]

Though not depicted by Selkirk, one might proceed similarly below the syllable level with syllabic 
roles that ultimately connect to segments (24). 

(24) English sensational: possible syllabic structure

[Diagram not reproduced; syllabic roles in left-to-right order: O N C O N C O N CO N C]

In our case we have used the four roles O, N, C, CO for onset, nucleus, coda and codaonset, the latter
being a representation for ambisyllabic segments and geminates (cf. Walther 1997, ch.3). Of course there
are many competing proposals as to which categories to adopt and how to make best use of the dominance
relationships. Yet all of these proposals share the common assumption of a finite category set; hence for
formal purposes the examples just given suffice to illustrate our main point.

This main point is how to linearize such graph-structured representations in conventional finite-state
models. In particular, a perspicuous lossless encoding of both dominance and immediate precedence
relationships is needed. Moreover, we would ideally want a kind of distributed representation where the
properties denoted by categories can be locally inspected rather than, say, demanding a nonlocal reference 
to some distant boundary symbol in a traditional bracketed notation. 

Towards this goal, our leading idea will be to reinterpret the transitive dominance relation as mono- 
tonic inheritance. Now Bouma & Nerbonne (1994, 44f) have pointed out that one of the restrictions of 
the inheritance relation, as it is usually defined, is idempotency: inheritance cannot distinguish between 
structures that differ only in recursion level (e.g. anti-anti-missile vs. anti-missile). Fortunately, the above
diagrams - and most others in the phonological literature - contain no such recursion in the literal sense. 
Although a super-foot category dominates an ordinary foot in (23), it is distinguished by a prime. To 
proceed with inheritance, we therefore demand that occurrences of any such pseudo-recursive categories 
must be pairwise distinct, which can be achieved by means of e.g. X-bar levels or primes that form part of 
the symbol at hand. 

A second restriction imposed specifically by monotonic inheritance is commutativity under associativity:
if C inherits from B and B inherits from A, then the result is the same as if C inherits from A and A
inherits from B. This equivalence implies that for purposes of simulation-by-inheritance the order of
dominance must not matter. One way to ensure this is to fix a particular order in advance. We therefore demand
that the dominance relation of any constituent structure must be consistent with an a priori given total order
>dom over the set of categories: ∀x, y : category(x) ∧ category(y) ∧ dominates(x, y) → x >dom y.
In our examples, we would have ω >dom Σ′_s, ω >dom Σ_{s,w} >dom σ_{(s,w)} >dom O, N, C, CO. Hence, a
diagram where e.g. σ dominated Σ would be ruled out as inconsistent.

With only a finite set of prosodic categories left that enter into formally non-recursive structures and 
moreover respect the dominance precedence relation >dom, the recipe for flattening a given structure is 
now quite simple to formulate (25). 

(25) a. To prepare classification of occurrences of category X, set up the following type hierarchy for
each nonterminal X ∈ CategorySet =def {c | ∃x : c <dom x ∨ x <dom c}:
X 



The intuition behind this is that category X is best modelled as a phonological event (Bird & 
Klein 1990), i.e. a temporal interval bearing the property X. On the standard assumption that 
there are terminal categories whose concatenation forms the 'terminal yield' of a category-as- 
temporal-interval, we then will be able to tag each of those terminal elements for their relative 
position within the interval (cf. also Eisner 1997a). In actual phonological practice, terminals 
will frequently be segments, but could also be features etc. Left or right brackets in boundary
subtypes signal interval beginnings or endings whereas the underscore symbol as left or right
part of a subtype stands for nonempty context, i.e. there is at least one terminal to the left or
right.

b. Whenever terminal type T, transitively dominated by category X e CategorySet, is found in 
initial or medial or final position of the terminal yield of X, add a conjunct \X or _X_ or X] 
to T. The 'or' is 'inclusive OR', in particular to cover length- 1 terminal yields. 

c. Whenever terminal type T is not transitively dominated by category X e CategorySet, add 
a conjunct -^X to T. 

d. Add a conjunct ^ [X] to all terminals whose type formula is not maximally specific with respect 
to category X. This step ensures full specification for boundary occurrences in those intervals 
which either contain more than one atom or are not multiply dominated. 
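
The recipe in (25) can be sketched procedurally as follows. This is only an illustrative approximation: constituents are assumed to be given as half-open index spans over the terminal string, the function name `flatten` and the category names are hypothetical, and the conjunctive type formulas of the paper are rendered as plain tag strings. Step (d) is folded into the tag choice, since a terminal that is both first and last in its interval receives `[X]` and all others receive a tag that excludes it.

```python
def flatten(terminals, spans, categories):
    """Tag every terminal for its position inside each category interval.

    spans: list of (category, start, end) with end exclusive.
    Returns a list of (terminal, [tag per category]).
    """
    rows = []
    for i, term in enumerate(terminals):
        tags = []
        for cat in categories:
            # Find the interval of this category that covers position i, if any.
            hit = next(((s, e) for c, s, e in spans if c == cat and s <= i < e), None)
            if hit is None:
                tags.append("¬" + cat)            # step (c): not dominated by cat
            else:
                s, e = hit
                left = "[" if i == s else "_"     # interval begins here, or context to the left
                right = "]" if i == e - 1 else "_"  # interval ends here, or context to the right
                tags.append(left + cat + right)   # step (b), fully specified as in (d)
        rows.append((term, tags))
    return rows

# Toy example: two syllables grouped into one foot, plus one unfooted syllable.
rows = flatten(list("patak"),
               [("syl", 0, 2), ("syl", 2, 4), ("syl", 4, 5), ("foot", 0, 4)],
               ["syl", "foot"])
for term, tags in rows:
    print(term, " ∧ ".join(tags))
```

The graph structure can be read back off such rows only if the dominance order among the categories is known, which is why the order is fixed in advance.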




To exemplify: (26) shows a flat representation of the joint diagrams of (23) and (24). (Note that for abbre-

viatory purposes we assume here that Y.^ ^ is the supertype of eL' and si '). The reader may verify herself 
that we can indeed recover the graph-structured version provided that the dominance precedence relation 
>dom is known. 

(26) Flat representation of sensational 



s A [O] A [cr. A A -.S' A [co. 
£A[N]A .a. A .S^_ A ^E' A .a;_ 
n A [C] A .cr] A .S^] A -.E' A .uj. 
s A [O] A [(7s- A [Eg- A [E^_ A .w_ 
e A [TV] A _cr^_ A .S^. A _S^_ A 
j A [C] A .as] A _Es_ A _E^_ A _a;_ 



J A [O] A [(7w- A .Es. A .E^_ A 
a A [Af] A .CTu,. A .E^. A .E^_ A .a;_ 
n A [CO] A [ct^] a _Es] A .E^_ A 
a A [TV] A .cr^_ A -.E A .E^_ A .a;_ 
1 A [C] A _C7^] A -.S A _S^] A .to] 



The impact of having a linearized distributed representation of (prosodic) constituency is twofold. First,
we can now refine prosodically underspecified segmental strings with suitable constraints that specialize
for each layer of the prosodic hierarchy. For example, given a finite-state version of declarative syllabification (Walther 1992, Walther 1995) for predicting syllabic roles from segmental information (itself using an
intermediate layer of sonority difference information), the next layer would use local syllable role configurations to demarcate syllable boundaries [σ, σ] and syllable interior _σ_^^, and so forth.

Second, we can freely use this locally encoded prosodic information to condition both generic and 
construction-specific constraints that must reflect some dependency on a given prosodic configuration. 
Eisner (1997a) illustrates, from the perspective of his Primitive Optimality Theory, just how appealing 
such local encodings can be for purposes of compact constraint formulation. Because in his results the 
emphasis is on locality in representational formats rather than on violability, one can be confident that their
advantages will be preserved in the present framework. 

4.7 Parsing 

So far we have described OLPM from the perspective of generation only. Because reversibility is usually 
held to be an important property in practical applications of finite-state networks, we will now briefly 
consider how to do parsing under the new framework. 

Disregarding optimization at first, parsing seems next to trivial. The central mechanism for constraint 
combination is automaton intersection, an associative operation that supports reversibility. Under this view 
one would simply intersect the string to be parsed with the FSA constituting grammar and lexicon; a 
nonempty result would then signal successful recognition. In the face of a structure-building grammar that 
adds e.g. syllable role information or other prosodic annotations, we of course should represent the parse 
string symbols as prosodically underspecified segmental types to allow for their subsequent specialization
in the process of intersection. 

However, an immediate complication is that in OLPM the technical arcs skip and repeat would prevent
literal matching of even structurally underspecified surface strings with the grammar. If e.g. some reduplicated form is to be recognized, the grammar will assign several repeat arcs as part of the 'surface' string,
and this decorated surface form will then fail to intersect with the plain, undecorated parse string. The solution is to employ a trivial preprocessing step at the interface between phonetics and phonology: enrich the
automaton corresponding to an undecorated parse string with consumer self loops that tolerate exactly the
set of technical symbols. Note that automata enriched in this way are still rather different from transducers,

^^For example, by way of the following monotonic rules: OV N [a / -^O , N \/ C a] / -^C\ O -U- / OW CO , 

N V C ^ jj- 1 C V CO. On automata representations of monotonic rules, see Bird & Ellison (1992, 34f). The disjunctions in 

the preceding rules can even be eliminated with a suitable featural decomposition of syllable roles using the features [±onset] and
[±coda] (Walther 1997, §3.4.3).
even granting a simulation of composite arc symbols as consecutive arcs

in conventional automata (cf. fig. 9 in US patent 5,625,554 granted to Xerox on April 29, 1997). This is 
evident from the fact that, in contrast to their behaviour in the simulation, odd arcs do not consistently play 
the role of inputs and neither can even arcs be seen as corresponding outputs. 

A second issue is that in parsing one would normally want a little more than mere recognition of
grammatical forms (and rejection of ungrammatical ones), namely categorial information in the form of 
morphological, syntactic and semantic properties or features. Although we have been silent on this issue 
up to now, it is actually simple to represent the required annotations in grammar and lexicon by reserving 
one or more final arcs at the end of automata for appropriate category labels, just like in the transducer- 
based proposals of Karttunen, Kaplan & Zaenen (1992). (Categorial information is again supposed to be 
pairwise disjoint from technical and segmental type information.) However, unlike in the FST version,
where one can map underlying categorial information to the empty string ε on the surface, in our one-level version this information would again be visible in the surface string. As a consequence, the above
preprocessing step needs to be slightly modified to tolerate categorial information in those self loops that 
are attached to final states. Finally, the parse string itself constitutes an unconfirmed hypothesis that needs 
verification by independently produced grammatical and lexical resources. This means that - at least if 
self-loop enrichments are present in the grammar - it is necessary to formally mark each segment of the 
parse as a consumer. Only when parse-segments-as-consumer-hypotheses intersect with matching producer 
segments from the lexicon, will they survive phase II of the two-stage evaluation procedure outlined in §4.3. 

The following definition of a parse operator in (27) accurately reflects the preceding discussion.
Because it makes use of the notational format of the Prolog-based FSA toolbox that will only be introduced later in §5, the reader is urged to come back to this section on a second reading. Note in particular the interspersed self loops, defined via the Kleene star operator *, and the intersection & of
preprocessed(ParseString) with grammar_and_lexicon.

(27) Parsing (in the absence of optimization) 



    preprocessed([SurfaceSegment|RestSegments]) :=
        preprocess(RestSegments, SurfaceSegment).

    preprocess([], LastSegment) :=
        [consumer(LastSegment & segment),
         consumer((technical_symbols ; categorial_information)) *].

    preprocess([CurrentSegment|RestSegments], PreviousSegment) :=
        [consumer(technical_symbols) *,
         consumer(PreviousSegment & segment) |
         preprocess(RestSegments, CurrentSegment)].

    parse(ParseString) :=
        closed_interpretation(preprocessed(ParseString) &
                              cache(grammar_and_lexicon)).
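
To make the effect of the self-loop enrichment concrete, here is a small procedural sketch in the spirit of (27), not a rendering of the FSA Utilities macros themselves. It checks whether a decorated surface string, as the grammar might produce it, is tolerated by a plain parse string once technical symbols are absorbed by self loops and categorial labels are admitted only after the last segment. The symbol names `repeat` and `skip` come from the text; the categorial labels and the function name `tolerates` are hypothetical.

```python
TECHNICAL = {"repeat", "skip"}          # technical symbols named in the text
CATEGORIAL = {"+Noun", "+Poss3sg"}      # hypothetical category labels

def tolerates(parse_segments, decorated):
    """True iff `decorated` equals `parse_segments` once technical symbols
    (anywhere) and categorial labels (only after the last segment) are ignored."""
    pos = 0
    for sym in decorated:
        if sym in TECHNICAL:
            continue                     # absorbed by a self loop
        if sym in CATEGORIAL:
            if pos < len(parse_segments):
                return False             # category labels are tolerated at the final state only
            continue
        if pos < len(parse_segments) and sym == parse_segments[pos]:
            pos += 1                     # ordinary segment match
        else:
            return False
    return pos == len(parse_segments)

good = tolerates(list("kat"), ["k", "repeat", "a", "t", "+Noun"])
early_label = tolerates(list("kat"), ["k", "+Noun", "a", "t"])
too_short = tolerates(list("kat"), ["k", "a"])
print(good, early_label, too_short)
```

In the automaton formulation the same effect is obtained by intersection rather than by a left-to-right scan, so that the grammar can simultaneously specialize underspecified segment types.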



As is to be expected, extending the OLPM parsing task to cover optimization adds new complications.
We can no longer be sure that a nonempty intersection with grammar and lexicon signals grammaticality;
such a result merely means that the parse string is consistent with one member of the set of alternatives to
be optimized over. To be sure, this is still a welcome improvement over the parsing problem that would
obtain in an all-default framework like OT, where the notion of consistency plays no role at all. However,
it means that a second step must be added to the parse step from (27).
That step consists first of extracting categorial information from the annotated parse string and
then using that information to generate an optimal surface result via application of Bounded Local Optimization. If this optimal result and the preprocessed parse string intersect, fine; if not, it means that the
parse string is ungrammatical. Extraction itself can be performed by composing (o) the annotated parse
string with a simple transducer that maps segmental symbols to their maximally underspecified representatives and preserves the identity of all other symbols. With some caching of intermediate results we
can avoid doing double work in our optimizing parser. Note also the use of Bounded Local Optimization
blo, which needs to know its LookaheadConstant. Here then again comes a code fragment that shows
optimizing_parse in all its glory:

(28) Parsing in the presence of optimization 



    extraction := [ { identity(consumer(repeat)),    % type identity
                      identity(consumer(skip)),      % ... ditto
                      identity(consumer(segment))    % ... ditto
                    } *,
                    $@:$@ *                          % token identity
                                                     % elsewhere, i.e.
                                                     % in mapping
                                                     % categorial info!
                  ].

    optimizing_parse(String, LookaheadConstant) :=
        blo( ( cache(parse(String))
               o
               extraction
             )
             & grammar_and_lexicon,
             LookaheadConstant)
        & parse(String).



We could call the preceding proposal a kind of analysis-by-synthesis approach (cf. Walther 1998 for further 
discussion in a feature-logical setting). Given these initial results, there clearly is a need for further research 
into parsing under optimization. In particular, one should investigate its efficiency in realistic cases and 
conduct a careful implementation that makes use of lazy automaton intersection (Mohri, Pereira & Riley 
1998). 

5 Implemented Case Studies 

In this section I will discuss worked examples from three languages that illustrate the interplay of the 
various enrichments and mechanisms proposed above. To be maximally concrete, snippets from the actual 
implementation are provided. The notational format is that of the FSA Utilities toolbox (van Noord 1997),
a subset of which is depicted in (29). 
(29) Format of regular expression operators

    []                  empty string
    {}                  empty language
    Lower:Upper         pair
    [E1,E2,...,En]      concatenation of E1, E2, ..., En
    {E1,E2,...,En}      union of E1, E2, ..., En
    E*                  Kleene closure
    E+                  Kleene plus ([E,E*])
    E^                  optionality
    E1 & E2             intersection
    RelA o RelB         composition
    identity(E)         identity transduction



Significantly for our purposes, FSA Utilities offers the possibility to define new regular expression operators. Departing from the original macro(Head, Body) notation I use the infix expression Head :=
Body - to be read as "Head is substituted by Body" - for reasons of better readability. Macro definitions
may be parametrized with the help of Prolog variables in order to define new regular expression operators 
in terms of existing ones. Also, Prolog hooks in the form of definite-clause attachments are provided to help 
construct more complicated expressions which would be too cumbersome to build using the above facilities 
alone. Finally, it is sometimes of importance that regular expression operators are alternatively definable 
through direct manipulation of the underlying automata. Again, here the toolbox provides abstract data 
types that support access to alphabets, states, transitions etc. 

I will take liberty in sometimes suppressing macro definitions whose details are not essential to the 
discussion at hand, resorting to descriptions in prose instead. Also, for the sake of brevity the type hierar- 
chy that structures the alphabet will not be displayed separately, which can be justified on the ground of 
mnemonic type names that make it obvious what the hierarchy would be like. 

5.1 Ulwa construct state infixation 

Ulwa is an endangered Misumalpan language spoken in Eastern Nicaragua. The purpose of this section
is to analyze the placement of possessive infixes in nouns, since "Ulwa serves as a nice example of a
language in which infixation is clearly sensitive to prosodic structure" (Sproat 1992, 49). While Ulwa data
have been discussed in the literature for some time (e.g. McCarthy & Prince 1993 and Sproat 1992), an
up-to-date descriptive reference has only recently become available (Green 1999). Green shows that Ulwa
nouns can participate in a syntactic construction called construct state, "a cover term for an entire paradigm 
of genitive agreement inflection" (ibid., 78) where the head noun is marked morphologically by affixation. 
The affix shows inflection for person and number (30). The primary semantics expressed by the construct 
is possession. 

(30) Forms of the Construct-state Affix

    Person    sg.      pl.
    1st       -ki-     -ki-na (exclusive)
                       -ni- (inclusive)
    2nd       -ma-     -ma-na
    3rd       -ka-     -ka-na-



(31) shows some data for the third person singular affix (-)ka-, collected from McCarthy & Prince 
(1993, 105) and Sproat (1992, 49) and checked against the dictionary in appendix B of Green (1999).^'' 

^Sproat additionally cites goad, gaad-ka 'God', while Green's dictionary completely lacks g-initial words; this is because he con- 
Long vowels are represented as VV, which both simplifies the statement of heaviness and eases the actual
analysis.

(31) Ulwa construct state suffixation/infixation

    base        his/her/its N    gloss
    bas         bas-ka           'hair'
    kii         kii-ka           'stone, rock'
    taim        taim-ka          'time' (preferred: aakatka)
    sapaa       sapaa-ka         'forehead'
    suulu       suu-ka-lu        'dog'
    asna        as-ka-na         'clothes, dress'
    paumak      pau-ka-mak       'tomato'
    waiku       wai-ka-ku        'moon, month'
    siwanak     siwa-ka-nak      'root'
    arakbus     arak-ka-bus      'rifle, gun' (Spanish: arquebus)



According to the descriptions of McCarthy & Prince and Sproat, primary stress in Ulwa falls on the first
syllable, if it is heavy, otherwise on the second syllable from the left. In Ulwa, the core syllable template is
(C)V(V)(C) with a small set of exceptions that exhibit complex onsets or codas. Syllables count as heavy
iff they are either closed off by at least one consonant (bas) or contain more than one vowel (kii, pau, taim).
Monosyllabic words are always heavy.

From this description and the data in (31) alone it would follow that the possessive affix is invariably 
located after the stressed syllable, emerging as a suffix after heavy monosyllables and as an infix otherwise. 
Note that the affix itself is never stressed. The immediate goals of the analysis to be developed below will 
then be to formalize both this stress distribution and the morphemes involved. 

Before we can do that, however, we should note that the fuller picture that Green (1999) presents both
for the infixation/suffixation behaviour and the stress facts does add some complications. There are many
cases of free variation between suffixation and infixation (kubalamh-ki ~ kuba-ki-lamh 'butterfly'). Stems
show two-way exceptions, some taking suffixes exclusively although infixation should a priori be allowed
(tiwiliski-ka, *tiwi-ka-liski 'sandpiper'), a small set also tolerating infixation although it should a priori
be ruled out (ta-ka-pas 'mouth'). The same goes for stress, which can sometimes oscillate ('baka, ba'kaa
'child'), while on other occasions preceding ('sarig 'avocado') or following (tas'laawan 'needlefish') the
locus predicted above. Interestingly, construct state formation may involve accentuation of the affix in a few
exceptional cases (ma-'ka-lnak 'payment'), can even disambiguate alternating stress (ba'kaa-ka, *'baka-ka
'child') and cause stress shift in a number of pseudo-reduplicative root shapes (kililih, kililih-ka 'cicada').
We refer the reader to Green's extensive discussion for further study, concentrating on the core cases in the
analysis to follow.

In a first step, the original, disjunctive formulation of the stress generalization can be simplified by
stating that the syllable containing the second mora from the left must be stressed. According to moraic
theory (Hayes 1995), a mora μ is an abstract unit of syllabic weight which figures prominently in the
analysis of stress systems of many of the world's languages. Thus, reference to moras is wellfounded in our
context. To exemplify, with μ following each moraic segment: baμsμ, taμiμmμ, aμsμ.naμ all receive stress on the first syllable, while saμ.paμaμ,
kuμ.luμ.luμkμ are accented on the second syllable.

To facilitate identification of moras, we will take an intermediate step by tagging each segment with
the relative difference in sonority. That is, we will mark whether sonority is rising, falling or level when
comparing each segment with its right neighbour. Recall that sonority is an abstract measure of intrinsic
prominence for speech sounds. While it is customary to employ sonority for determining full syllable

cludes that - given only a single native counterexample, aaguguh 'song, sing' - /g/ is not a phoneme of Ulwa. There is no contradiction
here because Green acknowledges that /g/ exists in a few obvious loan words. McCarthy & Prince erroneously cite the form kulu-
ka-luk from pseudo-reduplicative kululuk 'lineated woodpecker', which Green (1999, 54f) marks as ungrammatical since "speakers 
seem to recognize them as reduplicative in form, making these stems resist the infixation process." 
structure, which in turn then serves as the input to foot structure and stress computation, it is possible to
bypass all higher-level structure in the case at hand.^^ Also, for current purposes we can conflate most of
the distinctions of Blevins's (1995, 211) nine-positional sonority scale, keeping only consonant < vowel.
Given that scale, each segment is tagged with one of {up, down}, where the tag depends on the sonority
value of its right neighbour: if the right segment's prominence is higher, up is used, while down is assigned
in the case of lower or same sonority. The final segment, which has no natural right neighbour, is marked
with down. To exemplify: the Ulwa word for 'clothes' will be tagged a_down s_down n_up a_down and 'stone' is
marked as k_up i_down i_down. The crucial observation now is that moraic segments are exactly those that are
tagged with down.
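
The tagging scheme just described is simple enough to sketch directly. The following is a minimal procedural illustration, not the finite-state constraint used in the implementation; it assumes that the vowel letters in the transcription of (31) are a, i and u, and the function names are hypothetical.

```python
VOWELS = set("aiu")   # assumption: vowel letters occurring in the data of (31)

def sonority_tags(word):
    """Tag each segment 'up' if its right neighbour is more sonorous
    (consonant before vowel), otherwise 'down'; the final segment is 'down'."""
    tags = []
    for i, seg in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else None
        rising = nxt is not None and seg not in VOWELS and nxt in VOWELS
        tags.append("up" if rising else "down")
    return tags

def mora_positions(word):
    """Moraic segments are exactly those tagged 'down'."""
    return [i for i, tag in enumerate(sonority_tags(word)) if tag == "down"]

print(sonority_tags("asna"))    # 'clothes'
print(sonority_tags("kii"))     # 'stone'
print(mora_positions("sapaa"))  # indices of the three moraic segments
```

With the two-way scale consonant < vowel, a segment is tagged up only when a consonant precedes a vowel, which is why onsets come out as non-moraic while nuclei and codas come out as moraic.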

This observation has obvious repercussions on the formalization of Ulwa stress below: 

    material_is(Spec) := [consumer(Spec) *].

    stress := sonority_differences &
              [pre_main_stress, main_stress +, post_main_stress ^].

    pre_main_stress := [non_moraic *, mora, non_moraic *] &
                       material_is(unstressed).

    post_main_stress := [non_moraic, material_is(anything)] &
                        material_is(unstressed).

    non_moraic := consumer(up).

    mora(Spec) := consumer(down & Spec).

    mora := mora(anything).

    main_stress := mora(stressed).



Unsurprisingly, stress is built upon computation of sonority_differences. Note next how the
stress pattern itself is initially decomposed into zero or more unstressed non-moraic onset segments followed by the first mora followed by more non-moraic material (pre_main_stress). Thereafter comes
an obligatory stretch of stressed moraic material delimited by an optional block of post-stress segments
whose start is signalled by a non-moraic segment. Observe that there will be multiple adjacent stress marks
if the syllable hosting the second mora is not monomoraic. In other words: the whole rime of the accentuated syllable is formally marked as stressed (e.g. b_unstressed a_stressed s_stressed), in what amounts to the
explicit linear equivalent of the stress feature percolation or structural referral to σ_s that is implicit in more
traditional approaches. Also, stress is contingent on the availability of independently introduced lexical
material, hence everything is encoded as consumer-type information.

With stress assignment already given, it is now fairly easy to define the affix itself. 



    possessive_third_singular :=
        add_repeats(contiguous([consumer(stressed),
            producer(k & unstressed), producer(a & unstressed)])).



Its segmental content ka is simultaneously marked as unstressed, in accordance with the surface facts (modulo the small number of exceptional words of type ta-kaa-pas mentioned above, for which an allomorph
would have to be set up). Of course, this is producer information. Additionally, in this analysis the affix
receives a prosodic subcategorization frame: its left context restriction mentions an immediately adjacent 
stressed segment. As a contextual requirement, it must be encoded using the consumer macro. The whole 
tripositional sequence is wrapped with two more macros: contiguous introduces edge-only self loops 

^'As an aside, note that evidence for higher-level prosodic structure above the syllable role level is often surprisingly weak, in
stark contrast to the wholesale adoption of the entire prosodic hierarchy (McCarthy & Prince 1991) throughout most of the generative
literature. In the case of Ulwa, for example, Green (1999, 64) admits that iterativity in noun stress - usually held to be a basic reflex
of foot formation - rests on inconclusive evidence from three forms only.

Also, for a full phonological grammar encompassing syllabification, computation of relative sonority differences is independently 
needed, since it forms an essential first step in the declarative syllabification schemes of Walther (1993), Walther (1997). While 
these were originally couched in feature logic, a non-weighted finite-state version is both easy to implement and attractive due to 
its conceptual simplicity, as there is no need for a simulation of the Maximum Onset Principle (which complicated Mohri, Riley & 
Sproat (1996, 139)'s weighted finite-state syllabification for Spanish). However, the details are beyond the scope of this paper. 
to permit infixal behaviour on the one hand while disallowing internal breakup of -ka- on the other hand.
On top of this, add_repeats modifies the automaton as described in §4 to allow for possible uses in
reduplication. According to Green (1999), Ulwa does indeed exhibit reduplicative constructions, although 
the details are beyond the scope of this section. 

Encoding of stems is not complicated, yet contains some points worth noting: 



    discontiguous_lexeme(L) :=
        add_repeats(discontiguous(lexeme(L) & stress)).

    hair     := discontiguous_lexeme("bas").
    forehead := discontiguous_lexeme("sapaa").
    root     := discontiguous_lexeme("siwanak").
    gun      := discontiguous_lexeme("arakbus").



The first point is that, of course, stems must tolerate discontiguity in order to host infixes: 
discontiguous therefore adds self loops at all positions, not only at the edges. 

The innermost lexeme macro converts a Prolog string into a concatenation of producer-type segmental 
positions, working off the assumption that all symbols represent defined segmental types. 

The second and most interesting aspect is that the undecorated string automaton corresponding to the
lexeme itself is intersected with the stress constraint. This is nothing but the constraint-based equivalent
of a lexical rule application for stress assignment, applied to stems in isolation. The reason for assuming
lexically stressed stems is that we want to rule out coalescence, i.e. amalgamation of segmental material,
between affix and stem segments. As the affix contains the segments k and a, it could in principle overlap
the k in 'gun' (*ara-ka-bus) or the (pen)ultimate a in 'forehead' (*sap-ka-a, *sapa-ka). Note that such
overlapping placement would still satisfy the affix's prosodic requirement, as the immediately preceding
segment in these examples is indeed the second mora from the left and therefore would be surface-stressed.
To be sure, coalescence as such is attested in other languages (e.g. in Tigrinya, Walther 1997). However, it
must be forbidden in our Ulwa construction. The solution is now easy to understand: by lexically stressing
the stems, the left context of the infix in the ungrammatical coalescent realizations of 'gun' and 'forehead'
is fixed to unstressed, hence will properly conflict with the stressed requirement of the possessive
affix and lead to the elimination of the ill-formed disjuncts. Note that another, morphological solution to
the coalescence problem would have been to tag affix and stems contrastively, e.g. as -k_a a_a- versus
s_s a_s p_s a_s a_s (cf. Ellison 1993). Interestingly, at least in this case such purely technical diacritics seem not
to be required; rather, we can profit from a clean phonological solution.

There is one last aspect which needs our attention, and that concerns banning discontiguous main stress.
Even with lexical stress assignment, a heavy stem syllable such as as receives two formal stress marks, and
- being interruptible - could therefore satisfy the prosodic requirements of the possessive in two ways: *á-ka-s, √ás-ka-. Continuing with our phonological solution, the cure is again immediate: surface and lexical
stress must not conflict! Since the formulation of stress ensures contiguous main stress by way of the
Kleene plus operator, and because it is descriptively true that lexical stem stress coincides with the word-based surface stress pattern (again modulo the exceptions noted above), we can simply impose the stress
constraint once again on the whole infixed word to ensure full wellformedness of the Ulwa possessive
construction:



    word(Stem) := Stem & possessive_third_singular & stress.

    stems := {hair, forehead, root, gun}.

    possessive_nouns := closed_interpretation(word(stems)).



Note that it is the intersection of all constraints that defines a word. After expanding that macro in the body
of possessive_nouns with (the disjunction of) defined stems as actual argument together with pruning away of unsatisfied consumer arcs through closed_interpretation, we arrive at an automaton

^^Recall that this refers to the core cases; exceptions that ban infixation would require a different treatment. One way would be to
parametrize discontiguous for the actual content of self loops - currently the entire alphabet Σ - which then could be restricted
to technical symbols that are incompatible with the segmental content of affixes.
which contains exactly the desired grammatical surface strings {baska, sapaaka, siwakanak, arakkabus},
of course enriched with stress and sonority information.
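
As a cross-check on the core data in (31), the combined effect of the constraints can be mimicked by a small procedural sketch. This is a deliberate shortcut and not the declarative analysis: instead of intersecting automata, it computes the stressed stretch directly (the maximal run of moraic segments containing the second mora, following the prose description of the stressed rime) and places -ka- immediately after it. The vowel inventory and all function names are assumptions made for illustration, and the exceptional patterns reported by Green (1999) are not covered.

```python
VOWELS = set("aiu")   # assumption: vowel letters occurring in the data of (31)

def sonority_tags(word):
    """'up' before a more sonorous right neighbour, else 'down' (final segment: 'down')."""
    return ["up" if i + 1 < len(word) and seg not in VOWELS and word[i + 1] in VOWELS
            else "down"
            for i, seg in enumerate(word)]

def stressed_span(word):
    """Half-open span of the stressed rime: the maximal run of moraic ('down')
    segments that contains the second mora from the left."""
    tags = sonority_tags(word)
    moras = [i for i, tag in enumerate(tags) if tag == "down"]
    second = moras[1]                    # words in (31) all have at least two moras
    start, end = second, second + 1
    while start > 0 and tags[start - 1] == "down":
        start -= 1
    while end < len(word) and tags[end] == "down":
        end += 1
    return start, end

def possessive_3sg(word):
    """Insert -ka- right after the stressed stretch, never inside it."""
    _, end = stressed_span(word)
    return word[:end] + "ka" + word[end:]

for stem in ["bas", "sapaa", "siwanak", "arakbus"]:
    print(stem, "->", possessive_3sg(stem))
```

Placing the affix after the end of the stressed run, rather than after any stressed segment, corresponds to the ban on discontiguous main stress that the second application of the stress constraint enforces.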

At this point an additional remark seems appropriate. Thomas Green's reference work summarizes the
underlying cause of the construct-state phenomenon in derivational parlance as follows: "... the construct
morphology itself does not receive stress, and does not cause shifts in the stresses of the material which
follows it . . . it is as if the infixation takes place at a point in the derivation after the metrical structure
[i.e., stress, M.W.] of the word has been determined" (Green 1999, 64f). I take it to be quite satisfying that
the iterative process of formal grammar development, while seemingly being driven by technical problems
very unlike those of a descriptive grammarian, nevertheless led to the same fundamental conclusions.

The observant reader will have noted that the analysis presented so far is not based on the notion of drift,
hence does not need to make any use of optimization and the concomitant representational extensions. As
promised in §4, we will now show what the drift-based alternative looks like. As it turns out, the experience
is highly instructive and sheds new light on the pros and cons of optimization.

The first step in such an optimization-based analysis of the same facts of Ulwa construct-state infixation
consists in distributing the appropriate weights to both the affix and the stem, which in turn is contingent
on the direction of drift we would like to see in the data. Contrary to what McCarthy & Prince (1993, 107)
suggested with their use of the RIGHTMOSTNESS OT constraint, I would argue that the proper direction
is leftward. The reasoning is as follows: with lexical stem stress given as before, a leftward-drifting affix
will correctly 'float' towards the accented position. To prevent the affix from floating past that position -
which would immediately cause disruption of the lexical stress pattern - we again simply impose the same
stress constraint on the surface word form. In order to formalize leftward drift, the affix material must be
weighted cheaper than the stem, hence we will use two weights-as-types unmarked < marked to that
effect.

Here then is the exchanged portion of the stem-defining macros: 



    discontiguous_lexeme(L) :=
        add_repeats(discontiguous(lexeme(L) &
            material_is(marked) & stress)).



The crucial difference to the previous analysis now is that the affix does not need to be prosodically
subcategorized! Here is the new definition:



    possessive_third_singular :=
        add_repeats(contiguous(lexeme("ka") &
            material_is(unmarked & unstressed))).



The unmarked tagging of affixal material is the hallmark of leftward drift, as expected; no further mention
of stress is needed. The definition of word does not need any changes, since as already noted the same
need of avoiding surface discontiguities in main stress arises in the present analysis. The only remaining
difference is, of course, that bounded local optimization (blo, with look-ahead 1) needs to be formally
applied on the resulting, minimized (mb) automaton:



    optimized_possessive_noun(Stem) :=
        blo(mb(closed_interpretation(word(Stem))), 1).



With stems like hair, gun etc. as actual parameters, the same results obtain as in the previous analysis. It
is instructive to see what the automaton for Ulwa 'gun' - encoding {*arakbuska, *arakbukas, arakkabus}
- looks like before optimization (32).
(32) Weighted automaton for Ulwa 'gun'

[Figure: automaton graph not recoverable from the source. Surviving arc labels, each of the form 'segment-sonority-stress-weight': 'a-down-unstressed-marked', 'r-up-unstressed-marked', 'a-down-stressed-marked', 'k-up-unstressed-unmarked', 'a-down-unstressed-unmarked', 'b-up-unstressed-marked', 'u-down-unstressed-marked', 's-down-unstressed-marked'.]



Note how at each bifurcation in the graph the immediate alternative is between an unmarked affix
versus a marked stem segment; hence look-ahead k = 1 indeed suffices here. Also, as a byproduct of
pruning arc 4 → 5 first, the algorithm in (22) will never explore the set of states {5, 7, 9, 10, 13}; thus the
incomplete search embodied in BLO does indeed bear fruit in practical cases.
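
The pruning behaviour described for 'gun' can be imitated with a toy sketch. It is a strong simplification of the algorithm in (22): candidates are plain weighted strings rather than paths in a minimized automaton, look-ahead is fixed at 1, and the numeric weights 0 and 1 merely stand in for the weights-as-types unmarked < marked. All function names are hypothetical.

```python
UNMARKED, MARKED = 0, 1   # stand-ins for the weights-as-types unmarked < marked

def candidate(stem, affix, pos):
    """Weighted string with the affix placed before stem position `pos`."""
    return ([(seg, MARKED) for seg in stem[:pos]]
            + [(seg, UNMARKED) for seg in affix]
            + [(seg, MARKED) for seg in stem[pos:]])

def prune(cands, pos=0):
    """Look-ahead-1 pruning: among candidates sharing a prefix of length `pos`,
    keep only continuations whose next arc has the cheapest weight."""
    finished = [c for c in cands if len(c) == pos]
    unfinished = [c for c in cands if len(c) > pos]
    if not unfinished:
        return finished
    best = min(c[pos][1] for c in unfinished)
    branches = {}
    for c in unfinished:
        if c[pos][1] == best:                     # costlier branches are never explored
            branches.setdefault(c[pos], []).append(c)
    survivors = list(finished)
    for group in branches.values():
        survivors.extend(prune(group, pos + 1))
    return survivors

# The three placements that survive the stress constraints for 'gun'.
cands = [candidate("arakbus", "ka", p) for p in (4, 6, 7)]
winners = ["".join(seg for seg, _ in c) for c in prune(cands)]
print(winners)
```

At the first point where the candidates diverge, the unmarked affix segment competes with a marked stem segment, so the later placements are discarded without being inspected any further.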

With both analyses in place, let us sum up. The drift-based optimizing analysis probably holds some
special appeal to theoretical linguists because one can dispense with affix-specific prosodic subcategorization and cling to the 'lean lexicon' view that is popular in much of generative linguistics. Moreover, it could
be argued to be slightly more explanatory in that it sees nondisruption of lexical stress, a global desideratum for words, as the sole prosodic factor which drives the placement of Ulwa construct-state affixes.

^^With only slight changes to the formulation of the stress constraint, one could go even further and claim that the unstressedness
of the construct-state inflectional affixes itself is derivable as well: having fewer than two moras, CV syllables like ki, ka, ma will not
receive primary stress when viewed in isolation, e.g. in their lexical entry form!

One disadvantage of this analysis from a formal point of view is that it requires a canonical, i.e. minimized,
automaton format for application of BLO. BLO itself must be counted as an additional ingredient in the
analysis which furthermore precludes simple computation of a whole lexicon, as noted above.

The non-optimizing alternative, on the other hand, avoids all the problems associated with optimization,
but at the expense of greater representational cost: the affix specification requires explicit mention of the
left-adjacent stress peak. This in turn makes it harder to justify why exactly this subcategorization happens
to be crucial in Ulwa morphology, and why e.g. unstressedness on the third segment to the right would not
be an equally plausible candidate for a possible prosodic subcategorization. On the positive side one should
realize that local, surface-detectable properties such as adjacency to a stress peak would probably be rather
easily learnable for both child and machine. If proven to be feasible in future research, such a learnability
result would have an important impact on the ongoing debate about richness of lexical representations
versus the need for optimization.



5.2 German hypocoristic truncation

We briefly mentioned in (1) that German provides a productive form of truncation for hypocoristic forms
of proper names. Taking up the subject again, let us look at a representative sample of the data in (33).

(33) German hypocoristic i-truncation

a. Petra /'pe:tʁa/ > Pet-i                b. Andreas /an'dʁe:as/ > And-i
c. Gabriele /gabʁi'e:lə/ > Gab-i          d. Patrizia /pat'ʁi:tsia/ > Patt-i
e. Gorbatschow /'gɔʁbatʃɔf/ > Gorb-i      f. Chruschtschow /'kʁʊʃtʃɔf/ > Chruschtsch-i
g. Imke /'ɪmkə/ > Imk-i                   h. Hans /'hans/ > Hans-i

All derived forms in (33) end in -i, and all polysyllabic ones truncate some portion of their base (truncated
part shown in boldface).

Previous analyses have sought to establish a connection between the cutoff point in truncation
and syllable structure. Neef (1996) and Werner (1996) proposed that the initial portion of the base preceding
the -i must form a 'potential maximal syllable'. The qualification 'potential' is significant here,
because - as Gab.ri.e.le 'female first name' vs. Gor.bat.schow 'Gorbatchev' show - reference to actual
base syllabifications would make wrong predictions (*Gor-i, IGab-i, because .Gorb. is a maximal syllable,
but *.Gabr. with reversed consonantal cluster is not). However, (33).f,g show that the maximal syllable
approach misses some crucial data: *.Chruschtsch. and *.Imk. are ill-formed as syllables of German, yet their
i-suffixed versions are the correct hypocoristic forms. (33).f also rules out another proposal (Féry 1997),
namely that the relevant criterion should instead be one of 'simple second syllable onset': Chrusch.tsch-i
has a complex two-member onset /tʃ/.

In contrast to these proposals I claim that the simplest correct analysis is again one which makes direct
use of the subsyllabic concept of sonority. To see the plausibility of this claim, let us first assume that the
sonority scale for German is as follows (Wiese 1995):

obstruents < nasals < laterals < rhotics < high vowels < nonhigh vowels

Graphing sonority over time for the critical example Chruschtschow /kʁʊʃtʃɔf/ in (34),
we note next that the last base segment to be retained in the hypocoristic form is the second /ʃ/, which
is located at the first sonority minimum. A sonority minimum is defined as a segmental position at which
sonority goes upwards to the right while it does not rise from the left. Observe that the leftmostness
expressed by 'first' is not redundant for bases with at least two syllables, because suffixing the characteristic
ending -i to the unaltered base would create another minimum. Inspection of the other forms in (33) reveals
that the sonority-minimum criterion in fact forms a surface-true generalization over German hypocoristic
truncations.^^
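The sonority-minimum criterion can be checked mechanically. The following is a minimal sketch, assuming an ASCII symbol inventory (R for the rhotic, S for the postalveolar fricative, U and O for the lax vowels, @ for schwa) and hypothetical function names; only the six-step sonority scale is taken from the text.

```python
# Sonority classes ordered as in the scale above; the symbol inventory is assumed.
SONORITY_CLASSES = [
    "pbtdkgfsSh",  # obstruents
    "mnN",         # nasals
    "l",           # laterals
    "R",           # rhotics
    "iIuU",        # high vowels
    "aeEoO@",      # nonhigh vowels
]
SONORITY = {seg: rank for rank, segs in enumerate(SONORITY_CLASSES) for seg in segs}


def sonority_differences(word):
    """Tag each segment with the sonority change towards its right neighbour."""
    tags = []
    for left, right in zip(word, word[1:]):
        diff = SONORITY[right] - SONORITY[left]
        tags.append("up" if diff > 0 else "down" if diff < 0 else "plateau")
    tags.append("down")  # the word-final segment has no right neighbour
    return tags


def first_sonority_minimum(word):
    """Index of the first segment tagged 'up' whose left neighbour is not 'up'."""
    tags = sonority_differences(word)
    for i in range(1, len(tags)):
        if tags[i] == "up" and tags[i - 1] != "up":
            return i
    return None


base = "kRUStSOf"
cut = first_sonority_minimum(base + "i")
print(sonority_differences(base))
print(base[:cut + 1] + "-i")
```

Running the same function on the suffixed but untruncated form shows why leftmostness matters: in kRUStSOf-i the final f also sits at a minimum, yet the first minimum remains the second S.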

Given our sonority-based generalization, we can now put the pieces together to create a formal analysis. 
Here are high-level definitions for our example stem and the hypocoristic constraint: 



chruschtschow := contiguous_lexeme("kRUStSOf").
hypocoristic := sonority_differences &
                [sonority_based_cutoff_point,
                 truncated_part,
                 characteristic_ending].



Although all we really ask of a truncatable representation is tolerance of skipping, chruschtschow is
defined as a contiguous_lexeme for better reuse of existing macros. Stems will be intersected with
a hypocoristic constraint which - apart from tagging segments with sonority_differences -
dissects a word into an initial portion up to a sonority_based_cutoff_point, followed by a possibly
empty (Hans-i) truncated_part and finally an obligatory characteristic_ending. These macros are
in turn defined as follows:



sonority_based_cutoff_point := first_(sonority_minimum).

truncated_part := [producer(skip) *].
characteristic_ending := producer(i).

sonority_minimum := [consumer(segment & ~ up), consumer(up)].
first_(X) := [not_contains(X), X].



^^There are, however, three kinds of exceptions: (i) base portion is not a string prefix of the base word: Birgit > Bigg-i, *Birg-i
(C1V1C2C3 ... > C1V1C3 ...), Elisabeth > Liss-i, Barbara > Babs-i; (ii) base portion extends past the sonority minimum:
Depressiver > Depr-i (?Depp-i); As'phalt-i, Bank'rott-i, Be'deut-i, Alterna'tiv-i, Kom'post-i, Ele'gant-i, 'Erstsemest-i; (iii) base portion
stops before the minimum: West/Ostdeutscher > Wess-i/Oss-i, *West-i/Ost-i, Hunderter > Hunn-i. As far as I can tell, these exceptions
are a problem for all other published analyses as well, which is to be expected given the intimate connection between sonority graphs
and syllable structure. Neglected factors playing a role here seem to include the influence of morphological structure in the form of
(pseudo)compounding and avoidance of homonymy, the position of main stress, details of phonetic realization and recognizability of
the base. Therefore the proposed analysis should be viewed as the core component of a more complete account that integrates those
additional factors.

Of these, only the definition of first_ is not entirely straightforward. The idea here is to establish
leftmostness by excluding via not_contains any occurrences of a particular set of strings X in the - possibly
empty - material preceding the realization of X.

Testing the definitions so far reveals an interesting deficiency of the analysis as it stands: it contains an
unformalized hidden assumption of longest-match behaviour! To see this, consider in (35) the automaton
that closed_interpretation(chruschtschow & hypocoristic) evaluates to.

(35) Unoptimized Automaton for hypocoristic Chruschtschow 




Besides the correct realization, two alternative paths starting at nodes 3 and 4 mark unwanted
realizations that truncate earlier within the medial consonant cluster /ʃtʃ/. These alternatives occur
in the first place because (i) sonority differences are computed over the truncated surface form,
(ii) the amount of truncation is left indeterminate by truncated_part, and (iii) a stem-internal
(¬up)^n up configuration like our ʊ_down ʃ_plateau t_plateau ʃ_up provides n possible truncation points satisfying
sonority_based_cutoff_point, with no preference given to any one of them. Now Karttunen (1996)
reports an interesting simulation of longest-match behaviour for rewrite rules using only finite-state
machinery. Unfortunately, it appears that his results are crucially dependent on the ability of finite-state
transducers to describe two-level correspondences, precluding a transfer of his results to the present monostratal
setting. Fortunately, we already have another formal device to express preferences, namely Bounded Local
Optimization. Observe that the unwanted alternatives are distinguished from their grammatical counterpart
in that their length-2 subpaths 3 → 5 → 7, 4 → 7 → 8 all contain a skip - the hallmark of truncation
- whereas the corresponding grammatical paths 3 → 4 → 6, 4 → 6 → 8 are 'better' insofar as they
contain only proper segmental material. Therefore a weight scale segment < skip which feeds into
look-ahead-2 application of Bounded Local Optimization blo solves our problem by effectively preferring late
truncation. This leads us to the final formulation of i_formation:



i_formation :=
    blo(mb(closed_interpretation(chruschtschow & hypocoristic)), 2).
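The hidden longest-match assumption can be made visible by brute force. The sketch below enumerates every truncation whose retained portion ends exactly at the first sonority minimum of the truncated surface form, then picks the latest cutoff. It is an illustration under assumptions: the ASCII symbol inventory and all function names are hypothetical, and the global `max` over candidates merely mimics the outcome of the locally operating blo, it does not implement it.

```python
# Sonority ranks on the six-step scale; the symbol inventory is assumed.
SONORITY = {
    **dict.fromkeys("pbtdkgfsSh", 0),  # obstruents
    **dict.fromkeys("mnN", 1),         # nasals
    "l": 2, "R": 3,                    # laterals, rhotics
    **dict.fromkeys("iIuU", 4),        # high vowels
    **dict.fromkeys("aeEoO@", 5),      # nonhigh vowels
}


def rises(word):
    """True at index j if sonority rises from segment j to segment j+1."""
    return [SONORITY[b] > SONORITY[a] for a, b in zip(word, word[1:])] + [False]


def first_minimum(word):
    """First index whose tag is 'up' while its left neighbour's tag is not."""
    up = rises(word)
    return next((j for j in range(1, len(word)) if up[j] and not up[j - 1]), None)


def truncation_candidates(base, ending="i"):
    """Surface forms whose retained base portion ends at the first sonority
    minimum, with sonority computed over the truncated surface form."""
    out = []
    for cut in range(1, len(base) + 1):
        kept = base[:cut]
        if first_minimum(kept + ending) == cut - 1:
            out.append(kept + ending)
    return out


def prefer_late_truncation(cands):
    """Stand-in for the effect of 'segment < skip': keep the longest candidate."""
    return max(cands, key=len)


cands = truncation_candidates("kRUStSOf")
print(cands)
print(prefer_late_truncation(cands))
```

For the example stem the enumeration yields three candidates, one per position in the medial cluster, which matches the n = 3 truncation points of the (¬up)^n up configuration discussed above.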



At this point the alert reader may start to wonder whether German hypocoristic truncation admits a
non-optimizing alternative analysis similar in spirit to Ulwa. The short answer is yes, but with some extra
subtleties. Again, such an alternative involves lexical constraint application, this time pertaining to the
prespecification of sonority differences in stems.

The crucial observation leading to this is that too-short truncations necessarily entail a conflict between
the sonority curves of isolated stems and their truncated remainder in the i-suffixed word forms: the vocalic
suffix provides a new right context which affects the sonority difference of its consonantal left neighbour.
For example, in k_up ʁ_up ʊ_down ʃ_plateau t_plateau ʃ_up ɔ_down f_down the first /ʃ/ is on a sonority plateau because of
neighbouring /t/ in the stem, but will be tagged with the conflicting value up when higher-sonority /i/ forms
the new right context: *k_up ʁ_up ʊ_down ʃ_plateau≠up-i_down.

However, lexical tagging of sonority differences must be applied cautiously in our surface-true setting
because of monosyllabic words like Hans ~ Hans-i: since computation of individual sonority differences
requires examination of the right context segment, the sonority difference of the last stem segment - which
lacks a natural context - must be lexically underspecified to avoid a conflict down ≠ up whenever -i
adds some further word-level context. Fortunately, this demand corresponds well to a modular version
of sonority_differences, which separates the boundary_conditions responsible for tagging a
word's last segment with down from the plain_sonority_differences that take care of the rest.
Also, we must make sure that lexical tagging does see a non-truncatable, skip-free representation of the
stem in order to avoid inadvertent tagging of the set of all possible truncations. The last concern is addressed
by devising a separate add_skips regular expression operator which is applied to the result of lexical
tagging of skip-free stem material. This much being said, we can now present the alternative analysis:

technical_symbols := material_is((skip; repeat)).

sonority_differences :=
    ignore(plain_sonority_differences & boundary_conditions,
           technical_symbols).

% Prolog helper: map a list of character codes to a list of producer segments.
stringToSegments([], []).
stringToSegments([ASCII | Codes], [producer(Segment) | Segments]) :-
    name(Segment, [ASCII]),
    stringToSegments(Codes, Segments).

% Stems are tagged lexically before skips are added.
(stem(String) := add_repeats(contiguous(add_skips(
                     Segments & plain_sonority_differences)))) :-
    stringToSegments(String, Segments).

lexicon := { stem("kRUStSOf"), stem("hans") }.

non_optimizing_i_formation :=
    closed_interpretation(lexicon & hypocoristic).



In order to be surface-true despite occasional occurrences of interspersed technical_symbols, in
particular skip, sonority_differences jumps over sequences of such symbols with the help of
the built-in ignore operator. Note also that the stem macro makes use of the Prolog hook facilities
(:- stringToSegments(...)) to synthesize a regular expression denoting the concatenation of N
producer-type segments from a more convenient String description of length N, transferring the result
via another Prolog variable Segments. Finally, observe that the definition of hypocoristic has been
left unchanged, hence in particular imposing its word-level view of the sonority_differences to
fully specify even the last segment.
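How lexical prespecification filters out the too-short truncations can be traced with plain lists of tags. This is a sketch under assumptions, not the automaton-based implementation: the symbol inventory and every function name are hypothetical, and an underspecified final stem segment is represented by None.

```python
# Sonority ranks on the six-step scale; the symbol inventory is assumed.
SONORITY = {
    **dict.fromkeys("pbtdkgfsSh", 0), **dict.fromkeys("mnN", 1), "l": 2, "R": 3,
    **dict.fromkeys("iIuU", 4), **dict.fromkeys("aeEoO@", 5),
}


def tag(left, right):
    diff = SONORITY[right] - SONORITY[left]
    return "up" if diff > 0 else "down" if diff < 0 else "plateau"


def lexical_tags(stem):
    """Tags of the isolated, skip-free stem; the last segment stays underspecified."""
    return [tag(a, b) for a, b in zip(stem, stem[1:])] + [None]


def word_tags(word):
    """Word-level tags over the surface form; the final segment is tagged 'down'."""
    return [tag(a, b) for a, b in zip(word, word[1:])] + ["down"]


def compatible(stem, cut, ending="i"):
    """No retained stem segment may change its lexically specified tag."""
    surface_tags = word_tags(stem[:cut] + ending)
    return all(lex is None or lex == srf
               for lex, srf in zip(lexical_tags(stem)[:cut], surface_tags))


def ends_at_first_minimum(stem, cut, ending="i"):
    tags = word_tags(stem[:cut] + ending)
    minima = [j for j in range(1, len(tags)) if tags[j] == "up" and tags[j - 1] != "up"]
    return bool(minima) and minima[0] == cut - 1


def non_optimizing_i_formation(stem):
    return [stem[:cut] + "i" for cut in range(1, len(stem) + 1)
            if compatible(stem, cut) and ends_at_first_minimum(stem, cut)]


lexicon = ["kRUStSOf", "hans"]
results = {stem: non_optimizing_i_formation(stem) for stem in lexicon}
print(results)
```

Each stem now receives exactly one hypocoristic form without any preference mechanism, and the whole lexicon is processed in a single pass, which is the property the non-optimizing analysis trades against richer lexical entries.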

The pros and cons of the two alternative treatments of German hypocoristic truncation are pretty much
the same as in the case of Ulwa. In particular, as indicated in the code fragment above, we can again
compute the results of hypocoristic formation over the entire lexicon in one fell swoop. As in Ulwa,
careful lexical enrichment was the key to achieving the non-optimizing analysis for German. Although a
full conclusion seems premature at this point, the need for much more careful argumentation is only too
apparent when it comes to defending an all-optimization framework for phonological and morphological
analysis.

5.3 Tagalog overapplying reduplication

Tagalog, an Austronesian language with 14,850,000 first language speakers, is the national language of the
Philippines (Grimes 1996). The language is of particular interest to reduplication theorists because of its
wide use of reduplication in productive word formation, coupled with interesting instances of overapplication
of nasal assimilation and coalescence processes on a par with normal application of an intervocalic
flapping rule. While the specific Tagalog instance was already noted by Bloomfield (1933, 221f), it was
Wilbur (1973) who coined the generic term overapplication in her pioneering thesis. There she described a
class of reduplicative processes interacting with phonological rule applications, where a particular change
to the base effected by a phonological rule is mirrored in the reduplicant and vice versa, although only
one of these constituents actually provides the context required by the rule. In derivational terms the rule
is therefore said to overapply even where its context is not met, whereas with normal application this
behaviour would be ruled out. (The converse case of underapplication exists as well, even in Tagalog, but will
not be discussed further due to lack of space.) With the advent of constraint-based theories the mechanisms
for analyzing overapplicational reduplication have of course changed, but the terminology is still widely
used for convenience.

For information on Tagalog reduplication in the context of various generative analyses, see in particular
Carrier (1979), Marantz (1982), Lieber (1990, ch.4), McCarthy & Prince (1995). According to these
sources, Tagalog has three major reduplication patterns, termed RA, R1, R2 in Carrier (1979). Type RA
prefixes the initial CV portion of the base, accompanied by lengthening of the reduplicant vowel, while
R1 insists on a short vowel in the same CV reduplicant shape. Type R2 copies longer portions of the base,
sometimes the entire stem. These three patterns function in a variety of word formation processes, and
their semantic contribution can only be evaluated with reference to the accompanying affixes and other
aspects of morphological structure. For example, while any Tagalog verb can undergo RA reduplication to
receive aspectual marking, RA is interpreted as causative aspect only together with the prefix na-ka-, while
conveying future aspect in conjunction with certain subject-topic markers such as mag-, -um-. Note that
Tagalog verb forms always require at least one topic marker. The actual array of permissible topic markers
must be lexically specified for each verbal stem.

Because RA reduplication - due to lengthening - shows extra material not present in the base, and
because it can exhibit both overapplication and normal application effects, it seems to have just the right
amount of 'real-life' complexity for an illustrative implementation. Hence we will focus on this type in
what follows.

In (36) we provide a relevant sample of the data (ST/DOT abbreviate Subject/Direct Object Topic 
markers). 

(36) Tagalog CV: reduplication

a. mag-linis 'ST-clean'
b. mag-li:-linis 'ST-will clean'
c. mag-bukas 'open!'
d. mag-bu:-bukas 'will open'
e. nag-bukas 'opened'
f. nag-bu:-bukas 'is/was opening'
g. *na-ka-ʔantok
h. na-ka:-ka-ʔantok 'causing sleepiness'
i. ʔi-pa:-pag-bilih 'DOT-will sell'
   ʔi-pag-bi:-bilih
j. ma-ʔi:-ʔi-pag-linis 'will manage to clean for'
   ma-ʔi-pa:-pag-linis
   ma-ʔi-pag-li:-linis
k. p-um-i:lit 'one who compelled'
l. nag-pu:-p-um-i:lit 'one who makes extreme effort'



We can see in (36).a-f that RA reduplication has copied stem material and that its meaning contribution
varies as a function of the segmental prefix. However, in (36).h-l we find that the same reduplicative
process can also repeat affixal material, once again underscoring the phonological character of reduplicative
copying. Most interestingly, with more than one prefix we get free variation as to which part is reduplicated
(36).i,j, with only the leftmost affix being exempted from reduplication. To complete the case against a
putative morphological characterization of RA reduplication, (36).l shows an example where the reduplicant
freely combines stem and infix segments. As mentioned before, Tagalog also has an alternation known in
derivational terms as Nasal Substitution (37).

(37) Nasal Substitution and Assimilation

a. /maŋ-bilih/ → mamilih 'ST-shop'
b. /maŋ-dikit/ → manikit 'ST-get thoroughly stuck'
c. /maŋ-basah/ → mambasah 'ST-read'
d. /maŋ-dukut/ → mandukut 'ST-pick pockets'
e. /mag-kaŋ-dikit/ → mag-kan-dikit 'ST-get stuck accidentally as a result of


The first entries (37).a,b show an assimilation of the place features of the prefix-final nasal to the
following stop, coupled with a coalescence of the two segmental positions into just one (the substitution).
However, the coalescence part of the alternation is subject to two-way exceptions: not only do certain
bases resist coalescence in the presence of coalesceable prefixes like maŋ- (37).c,d, but also there are
other prefixes like mag-kaŋ- which - although they share the final velar nasal - do not coalesce under
concatenation with the right bases (37).e. We will have to take care of this lexical conditioning in the
analysis to follow.

The interaction of Nasal Substitution and RA reduplication now produces the overapplication effects
mentioned above (38).

(38) Reduplication and Overapplying Nasal Substitution

a. /paŋ-pu:tul/ → pa-mu-mu:tul 'that used for cutting'
   *pa-mu-pu:tul (Bloomfield 1933, 221f)
b. /maŋ-kaʔilanan/ → ma-ŋa:-ŋaʔilanan 'ST-will need'
   *ma-ŋa:-kaʔilanan
c. /maŋ-pulah/ → ma-mu:-mulah 'will turn red'
   *ma-mu:-pulah



Although only the leftmost instance of the triggering plosive is local to the prefix-final nasal, its second
occurrence must be realized as a place-assimilated nasal as well: nasal substitution overapplies. To
complicate matters, however, it must be noted that this kind of long-distance dependency between base and
reduplicant segments is restricted to certain segmental classes: like in the rest of morphology (39).a,b, two
occurrences of dental stops created by one of the reduplication patterns can become dissimilated through
intervocalic flapping, a normally applying phonological 'rule' (39).c-e.

(39) d ~ r: Normal Application of Intervocalic Flapping

a. da:mot 'stinginess'            b. ma-ra:mot 'stingy'
c. man-da:-ramboŋ 'bandit'        d. sunud-sunur-in 'no gloss'
e. d-um-a:-ratiŋ-datiŋ 'attends now and then'



With this last piece of evidence on Tagalog reduplication we have now assembled enough data and
generalizations to proceed to the analysis itself.

^'Bloomfield (1933, 221f) cites the contrasting case of [ta:wa] 'a laugh', with [ta:ta:wa] 'one who will laugh' turning into
[tuma:ta:wa] 'one who is laughing'. Although here it appears as if RA reduplication would not always copy right-adjacent
material - skipping over infixed -um- in this case - we will in fact assume that [ta:ta:wa] and similar cases are lexicalized reduplications,
i.e. new stems, to which -um- infixation regularly applies. If this assumption turned out to be wrong, one would have to make the
RA reduplicant discontiguous and condition the contrastive behaviour of pairs like [nag-pu:-p-um-i:lit] vs. [t-um-a:-ta:wa] with the
help of suitable morphological features.

Because our aim is to model RA reduplication, let us start with various macro specifications which
help implement the necessary synchronisation of morpheme edges. The idea here is that all non-floating,
i.e. more or less concatenative morphemes have their left edge synchronized (type synced) and their
interior material unsynchronized (~synced), while only the right edge of the stem is synchronized as
well; bound morphemes remain unsynchronized. Floating morphemes like Tagalog's famous -um- would
be special in that they would be underspecified for synchronization, to be compatible with whatever landing
site their prosodic conditions demand; however, we disregard the additional complexity for the fragment
under development. Here is the code portion for synchronisation (applications of it will appear in later code
snippets):



synced_position(Spec) := consumer(synced & Spec).
synced_position := synced_position(anything).
synced_producer(Spec) := producer(synced & Spec).

unsynced_position(Spec) := consumer(~ synced & Spec).
unsynced_position := unsynced_position(anything).

unsynced_portion := [unsynced_position *].

left_synced_portion := [synced_position, unsynced_portion].
synced_constituent := [left_synced_portion, synced_position].



Next we need to turn to the representational needs of stems. They should of course be reduplicatable,
internally discontiguous^" (for later tolerance of floating morphemes), and synchronized at both edges.
While non-stop-initial stems like linis need only basic segmental specifications in addition to these needs
to complete their definition, stems like pulah, dikit, kaʔilanan require additional means to implement the
alternating behaviour of their initial plosives. Recall that in Declarative Phonology destructive processes
like the one hinted at by 'nasal substitution' are impossible to represent literally, hence must give way to
alternative representational treatments. The solution here is to underspecify the manner features signalling
obstruenthood and (non-)voicing in one alternant of the initial segment's specification, with a fully
specified stop constituting the other alternant. In a cooperative fashion, coalesceable ŋ-final prefixes will then
supply the missing nasal manner feature to guarantee full specification after the intersection of all
participating morphemes-as-constraints has been done. Finally, to model the default stand-alone realization as
a plain stop, the fully specified alternant is encoded as a producer, whereas the 'cooperative', underspecified
alternative is specified as a consumer. With these definitions in place, a wide range of stems can now
receive their representations with ease:



discontiguous_stem(Segments) :=
    add_repeats(contiguous(internally_discontiguous(Segments) &
                           synced_constituent)).

underspecified(BaseSpec, AddedForFullSpec) :=
    { producer(BaseSpec & AddedForFullSpec),
      consumer(BaseSpec) }.

bilih := discontiguous_stem(
    [underspecified(labial, obstruent & voiced),
     stringToSegments("ilih")]).

pulah := discontiguous_stem(
    [underspecified(labial, obstruent & ~ voiced),
     stringToSegments("ulah")]).

dikit := discontiguous_stem(
    [underspecified(dental, obstruent & voiced),
     stringToSegments("ikit")]).

kaqilanan := discontiguous_stem(
    [underspecified(dorsal, obstruent & ~ voiced),
     stringToSegments("aqilanan")]).

basah := discontiguous_stem(stringToSegments("basah")).

dukut := discontiguous_stem(stringToSegments("dukut")).

guloh := discontiguous_stem(stringToSegments("guloh")).

laakad := discontiguous_stem(stringToSegments("laakad")).

linis := discontiguous_stem(stringToSegments("linis")).

dambong := discontiguous_stem([producer(voiced & dental),
                               stringToSegments("amboN")]).

^"Note that internally_discontiguous is a variant of the discontiguous operator introduced in earlier analyses that
spares both start and final states from the self-loop enrichment. It follows that we can safely intersect such a representation with
additional constraints like our synced_constituent without fear of inadvertent constraining effects on peripheral
material introduced outside of the morphemic domain under construction. In particular, here we want the edges of a morpheme to be
synchronized, not the edges of the whole word. Upon wrapping the result with the previously defined contiguous operator, which
introduces self loops to the start and final states only, we then have an appropriately conditioned partial description of an entire word
with the required tolerance of other morphemes that may be present.



With the previous remarks on underspecification as a suitable strategy for (many) seemingly destructive
alternations, the reader will have no difficulty understanding the definition for dambong, not mentioned
before; here again missing manner features will be supplied from later constraints to guarantee full
specification of its initial voiced dental, fleshing it out either as a stop or as a flap.

With the definitions for stems in place, let us concentrate next on interesting prefixes, the foremost of
which are the ŋ-final ones. They present a good case for Ellison (1993)'s claim that intersective morpheme
combination is often to be preferred over concatenative combination, since part of their definition is the nasalhood
imputed on suitable stem-initial consonants (producer(nasal)), i.e., an underspecified contextual
restriction that must be realized outside of their own morphemic interval. Because they are also ordinary
segmental prefixes, however, these morphemes are specified as contiguous synced_constituents.
What makes them special is that - even when coalescence of nasal and stop features is not possible,
like in mam-basah, *ma-masah - the nasal must assimilate in place to the following obstruental stop,
if any (otherwise receiving a default place, namely dorsal); hence the other prefix-final alternant
labelled assimilated_nasal_obstruent_sequence.^' A non-coalescing prefix like magkang
differs minimally in dispensing with the underspecified nasal alternant, but retains the assimilation part, as
demanded by the data in (37):



ng_final_prefix(String, SpecifiedAlternant) :=
    contiguous(synced_constituent &
               [stringToSegments(String),
                { SpecifiedAlternant,
                  assimilated_nasal_obstruent_sequence }]).

coalescent_nasal := producer(nasal).
assimilation_only := {}.   % empty language = discard alternant!

mang := ng_final_prefix("ma", coalescent_nasal).
pang := ng_final_prefix("pa", coalescent_nasal).
magkang := ng_final_prefix("magka", assimilation_only).

^'A subtle detail needs our attention here: the conjunction of synced_constituent with the two-way alternating segmental
specification correctly marks the rightmost segment as synced, irrespective of the ultimate length of the prefix (length 2 for the
coalescent case, length 3 for the pure assimilation case). This rightmost segment is nothing else but the beginning of the stem, hence
correct intermorphemic alignment can be ensured with the help of one additional constraint (to be detailed later) which governs the
correct distribution of synchronisation marks.



The place-assimilating behaviour itself must be simulated in a piecewise fashion for all the possible place
categories involved (place([labial, dental, dorsal])), because of the well-known lack of token
identity in regular description languages. We can regain some expressivity, though, by making use of a
recursive macro assimilation_for to off-line-synthesize the iterated disjunction from the list of places
of articulation. The trick is to exploit the fact that Prolog is the host language of our finite-state toolkit,
which means that we can use token identity at the description level, though not at the object level. Employing
the power of logical variables we therefore distribute the variable SharedCategory to each disjunct
(which will of course become bound at compile time).

Putting together the pieces in assimilated_nasal_obstruent_sequence again requires a
judicious use of producer- and consumer-type information to distinguish the lexical contribution of the
morpheme itself - which is the nasal part - from the contextual requirement participating in assimilation -
which is the obstruent part. Finally, a default disjunct encodes the dorsal realization that is observed
whenever a suitable obstruent stop is lacking:

place([labial, dental, dorsal]).   % Prolog fact holding
                                   % list of categories

(place_assimilation := assimilation_for(Place)) :- place(Place).

assimilation_for([]) := '{}'.
assimilation_for([SharedCategory | Categories]) :=
    { [consumer(SharedCategory), consumer(SharedCategory)],
      assimilation_for(Categories) }.

assimilated_nasal_obstruent_sequence :=
    { [producer(nasal), consumer(obstruent)] & place_assimilation,
      default }.

default := [producer(nasal & dorsal), consumer(~ obstruent)].
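The off-line unfolding performed by the recursive macro can be imitated in a few lines. The sketch below is an assumption-laden analogue, not part of the toolkit: it emits the nested disjunction as a plain string, and the helper `agreeing_pairs` is a hypothetical addition that spells out the denotation as the set of place pairs with identical members.

```python
def assimilation_for(categories):
    """Unfold a list of place categories into a nested disjunction,
    mirroring the two clauses of the recursive macro."""
    if not categories:
        return "{}"  # empty language terminates the recursion
    shared, rest = categories[0], categories[1:]
    # The same value is distributed to both positions of the disjunct.
    return "{ [consumer(%s), consumer(%s)], %s }" % (
        shared, shared, assimilation_for(rest))


def agreeing_pairs(categories):
    """Denotation of the disjunction: pairs that agree in place."""
    return {(place, place) for place in categories}


PLACES = ["labial", "dental", "dorsal"]
print(assimilation_for(PLACES))
print(("labial", "dental") in agreeing_pairs(PLACES))
```

The point of the construction is that identity is enforced once per category at description time, so the resulting expression stays regular while still ruling out mixed pairs such as labial followed by dental.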



Before we proceed to add the reduplicative part to our growing Tagalog fragment, let us test the definitions
so far to obtain some highly instructive intermediate results. With a little additional code shown below
to ensure wellformed synchronization, we can in fact form meaningful intersections like
word(mang & bilih).

matching_synchronisation :=
    not_contains2(two_synced_segments_in_a_row).
two_synced_segments_in_a_row :=
    [consumer(synced), consumer(synced)].

generic_word_constraints := matching_synchronisation.

word(Expr) := closed_interpretation(Expr & generic_word_constraints).

However, the surprising result is that - with coalesceable stems and alternating ŋ-final prefixes - we actually
get two results (40).

(40) /maŋ-bilih/: Nasal Coalescence Problem

Besides the desired coalescing alternant m (2 → 4) we get an unwanted alternative mb (2 → 3 → 4)
that does not merge the two segmental positions. Some reflection reveals that this behaviour is unavoidable
given our modelling assumptions about independent and compositional specification of morphemes. Since
the prefix must combine with both the nasal-substituting bilih and the non-substituting basah, the sequence
mb cannot be ruled out categorically, hence both alternants must remain in the denotation of the affix.
Likewise, an alternating stem like bilih needs to tolerate non-nasal realizations of its first segment in addition
to the coalescent nasal case, to cater for isolated pronunciation or other prefixal options. If we devise an
abstract feature [± coalesce] to characterize both stems and prefixes, and relate the alternating case to an
underspecified feature value [0 coalesce], we can illustrate the scenario with the following paradigm (41).

(41) Prefix-stem Combination Paradigm

                       Prefix →    [- coalesce]      [0 coalesce]
    Stem ↓
    [- coalesce]                   magkam-basah      mam-basah
                                   [- coalesce]      [- coalesce]

    [0 coalesce]                   magkam-bilih      ma-milih
                                   [- coalesce]      [+ coalesce]


The paradigm reveals that we actually demand full specification from the (intersective) combination of 
two underspecified features [0 coalesce], an impossibility in a monotonic setting where underspecification 
is equivalent to a (systematic) disjunction of fully specified disjuncts. This state of affairs has been noticed 
before: replacing 'coalesce' with 5, the featural part of the paradigm turns out to be a verbatim copy of 
the one discussed in (Ellison 1994a, p. 31, ex. (15)) under the general rubric "paradigm[s that] might be
decomposable into morphemes only with the use of defaults." With Tagalog Nasal Substitution we have
merely uncovered a first concrete instance of the abstract case predicted by Ellison. The default in our case 
would be [+ coalesce]. 

Fortunately, optimization again comes to our rescue to implement the required default behaviour. Note
that the default preference for coalescence amounts to an eager choice of synchronized morpheme beginnings
in our representational setting. This is illustrated well in the critical choice between unsynchronized
2 → 3 and synchronized 2 → 4 in (40). Therefore, Bounded Local Optimization parametrized with
look-ahead 1 and a weight scale ¬synced > synced completely solves our little problem.
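
The effect of look-ahead-1 optimization on the critical state of (40) can be illustrated with a small Python sketch. It is a simplified stand-in, not the implementation used for the case studies: the dictionary representation of the automaton, the function name prune_locally, the arc labels and the numeric weights (0 for synced, 1 for unsynced) are all assumptions made for illustration.

```python
# Arcs are (label, weight, target) triples. The weights are hypothetical:
# 0 for a synced arc (preferred), 1 for an unsynced one.
SYNCED, UNSYNCED = 0, 1


def prune_locally(automaton):
    """Keep, at every state, only the outgoing arcs of minimal weight."""
    pruned = {}
    for state, arcs in automaton.items():
        if not arcs:
            pruned[state] = []
            continue
        best = min(weight for _, weight, _ in arcs)
        pruned[state] = [arc for arc in arcs if arc[1] == best]
    return pruned


# Fragment modelled on (40): state 2 can either realize an unsynced m and
# continue to state 3, or realize the synced (coalesced) m and go to 4.
fragment = {
    2: [("m", UNSYNCED, 3), ("m", SYNCED, 4)],
    3: [("b", SYNCED, 4)],
    4: [],
}

if __name__ == "__main__":
    print(prune_locally(fragment)[2])
```

After pruning, state 2 retains only the synced arc to state 4, so the non-coalescing path through state 3 is no longer reachable from it.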

With Nasal Substitution handled on its own, let us move on towards a declarative formalization of
type RA reduplication with overapplication. Because we again see reduplication as partial specification
of an entire word, we construct it as follows. Before the copy we need to expect at least one (prefixal)
morpheme, possibly more (36). After it we must allow for the rest of the base and also tolerate whatever
morphemic material finishes off the word (especially in the case where one of the prefixes is targeted for
reduplication and the stem must still be incorporated). Within the reduplicant-base compound we need to
enforce a kind of long-distance agreement between the quality of the first reduplicated segment and its
base counterpart to model 'overapplication'. However, as we have seen in the discussion of flapping, in
a surface-true perspective this kind of agreement must be carefully constructed to hold only over certain
subcategories of the segmental makeup and spare the dimensions involved in (non-)flapping.

On the face of it, the type_RA_reduplicant itself copies the first two base segments, which happen
to form a CV sequence, and lengthens the V. To do so formally, we can identify the initial base segment
as a synced_position, hence its successor is of course an unsynced_position. For lengthening
of the vowel, we have chosen a formalization which makes use of the repeat arcs to step back one position
and then proceed forwards again by providing for an abstract vowel position, which can only be
the vowel we have already targeted before. After that, a Kleene-plus-wrapped producer(repeat) steps
all the way back to the beginning of the base, which again is characterizable as a synchronized segment.
However, to make sure that consumer-type underspecified alternants as contained in the first segment of
bilih will survive closed_interpretation, we have been careful here to specify this beginning of
the base proper with a matching synced_producer. As in the nasal assimilation instance, the segmental
positions agreeing in an overapplicational manner are again abstracted out via the logical variable
AgreeingFeatures. Enforcing the agreement in a reasonably perspicuous way is done via the Prolog
hook mechanism of FSA Utilities; the respective predicate enforce_agreement_in_/3 constructs an
iterated disjunction abbreviating the repeated realization of the abstract reduplicant parametrized for each
of the five values of the overapplicational categories.

This extensionalization of agreement is indeed the price to pay for a framework that does not handle 
(possibly overridable) true token identity at the object level, as demonstrated by (Beesley 1998). That pa- 
per is relevant here because it contains a careful study of the problem of long-distance morphosyntactic 
dependencies in a finite-state framework together with potential remedies; one could indeed apply some of 
the solutions proposed there to our present task of modelling nonlocal phonological dependencies. In par- 
ticular, the use of weakly non-finite-state enhancements like global registers that can be set (for initializing 
an agreeing feature) and tested against (an existing feature value) should be a profitable amendment, once 
residual problems with automaton transformations like minimization in the face of enhanced automata have 
been settled. 

Here then is the central code portion responsible for Tagalog RA reduplication:



some_morpheme := left_synced_portion.
pre_base_morphemes := [some_morpheme +].

rest_of_base := unsynced_portion.

rest_of_word := [some_morpheme *, synced_position].

(ra_reduplicated_word := [pre_base_morphemes, Reduplicant,
                          rest_of_base, rest_of_word]) :-
    enforce_agreement_in_(type_RA_reduplicant,
        [   nasal,                           % 1: /m,n,N/
          ~ nasal & ~ voiced &   labial,     % 2: e.g. /p/
          ~ nasal & ~ voiced & ~ labial,     % 3: e.g. /t/
          ~ nasal &   voiced &   labial,     % 4: e.g. /b/
          ~ nasal &   voiced & ~ labial],    % 5: e.g. /d/
        Reduplicant).

type_RA_reduplicant(AgreeingFeatures) :=
    [ synced_position(AgreeingFeatures),
      unsynced_position, producer(repeat),
      unsynced_position(vowel), producer(repeat) +,
      synced_producer(AgreeingFeatures) ].

enforce_agreement_in_(_MacroName, [], '{}').
enforce_agreement_in_(MacroName, [Cat | Cats],
                      { MacroInstance, MoreMacroInstances }) :-
    MacroInstance =.. [MacroName, Cat],
    enforce_agreement_in_(MacroName, Cats, MoreMacroInstances).

optimal_word(Expr) :=
    blo(mb(closed_interpretation(Expr & word)), 1).



We are now in a position to show in (42) the actual outcome of an automaton specification such as

optimal_word(mang & bilih & ra_reduplicated_word)

(42) /maŋ-RA-bilih/ [mamiimilih]: RA, OVERAPPLYING




With the addition of some ordinary prefixes we can even model the free variation in reduplication 
arising from the presence of multiple prefixes. For ease of exposition these will be straightforwardly concatenated
[mag, qi, pag, ...] rather than being intersected. The intersective modelling variant would
make use of the method proposed in Ellison (1993) to simulate concatenation with the help of suitable 
tagged representations. 



prefix(String) :=
    add_repeats([stringToSegments(String) & some_morpheme,
                 material_is(anything)]).

qi  := prefix("qi").     % 'q' denotes glottal stop
pag := prefix("pag").
ma  := prefix("ma").



The result of word([ma, qi, pag, linis] & ra_reduplicated_word) is depicted in (43).

(43) Free variation in RA reduplication




It turns out that we need to do something else to preserve this result under BLO application
(optimal_word(...)). The reason is that we will otherwise lose one of the alternants due to a
nonsensical comparison to a repeat-initiated alternative path, no matter what weight we assign to repeat: it
cooccurs with both the larger (4 → 5) and the smaller weight (7 → 9) in the automaton of (43). This then is
a perfect case for applying the slight modification to weight computation proposed on page 20, fn. 20: because
we want repeat arcs to behave as inert with respect to BLO, we assign them infinite negative weight,
thereby exempting them from ever getting pruned. Remember that this welcome result follows because the
essence of the modification is that weight-sum comparison now only looks for better positive alternatives.

The final piece of the analysis handles the normal application of the flapping 'rule': only intervocalic
voiced dentals are flapped [r], elsewhere they are pronounced as a stop [d]. In the context of reduplication
the rule applies uniformly to both base and reduplicant, which translates into simple intersection of a
surface-true constraint flap_distribution with the word. Here it is actually helpful that our framework
is limited to type identity, since the repetition of previous material will not copy particular allophonic
choices in either base or reduplicant, allowing the flap constraint to rule freely. This constraint is defined
below:



nonflapped_intervocalic_dental_stop :=
    [consumer(vowel), consumer(d), consumer(vowel)].

preCflap  := [consumer(flap), consumer(~ vowel & segment)].
postCflap := [consumer(~ vowel & segment), consumer(flap)].

no_peripheral_occurence_of(Segment) :=
    ([consumer(segment & ~ Segment), [material_is(segment),
      consumer(segment & ~ Segment)] ^ ] ^).

ignore_intervening_technical_symbols(Expr) :=
    ignore(Expr, material_is(misc)).






flap_distribution :=
    ignore_intervening_technical_symbols(
        not_contains3(nonflapped_intervocalic_dental_stop) &
        not_contains2(preCflap) &
        not_contains2(postCflap) &
        no_peripheral_occurence_of(flap)).



Observe that the constraint is built up (in an admittedly somewhat roundabout way) from negative subconstraints
banning intuitively sensible subconfigurations where a flap must not occur. This is done by way of
parametrized constraints not_contains2/3 that ban occurrences of their length-2/3 string arguments (details
suppressed for the sake of readability). After that has been done, the resulting automaton is modified
to ignore_intervening_technical_symbols, which will occur especially in reduplication.

With flap distribution in place, we have removed the last barrier to a fairly complete account of type 
RA reduplication, as we can now demonstrate even the correct outcome of 

optimal_word(mang & dambong & ra_reduplicated_word &
             flap_distribution),

which in compact regular expression notation is

[m, a, n, d, a, repeat, a, repeat, repeat, repeat *, 'D', a, m, b, o, 'N'].

Note both the nonflapped d that starts off the reduplicant after the consonant-final prefix and the flapping
of 'D' in the base, which correctly happens even across intervening repeat symbols, because the vowels
a_a form its ultimate segmental context.
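
The surface-true character of the flap constraint can be checked on concrete symbol sequences with a short Python sketch. It is a simplification made under stated assumptions: symbols are plain strings rather than feature-based categories, 'D' stands for the flap and 'd' for the stop as in the regular expression above, and the vowel inventory, the set of technical symbols and the function name flap_distribution_ok are hypothetical.

```python
VOWELS = set("aeiou")            # assumed vowel inventory
TECHNICAL = {"repeat", "skip"}   # assumed technical symbols to be ignored
FLAP, STOP = "D", "d"


def flap_distribution_ok(symbols):
    """Check the negative subconstraints on a list of surface symbols."""
    # Technical symbols are ignored before the constraints are evaluated.
    segs = [s for s in symbols if s not in TECHNICAL]
    if not segs:
        return True
    # No flap at the periphery of the word.
    if segs[0] == FLAP or segs[-1] == FLAP:
        return False
    for i, seg in enumerate(segs):
        prev = segs[i - 1] if i > 0 else None
        nxt = segs[i + 1] if i + 1 < len(segs) else None
        # No nonflapped intervocalic dental stop.
        if seg == STOP and prev in VOWELS and nxt in VOWELS:
            return False
        # No flap next to a non-vowel on either side.
        if seg == FLAP and (prev not in VOWELS or nxt not in VOWELS):
            return False
    return True


# One instance of the regular expression given for mang & dambong.
word = ["m", "a", "n", "d", "a", "repeat", "a", "repeat", "repeat",
        "repeat", "D", "a", "m", "b", "o", "N"]

if __name__ == "__main__":
    print(flap_distribution_ok(word))
```

Swapping either allophone in this sequence (a flap after the prefix-final nasal, or a stop between the two vowels) makes the check fail, which mirrors the way the intersected constraint filters such paths.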

6 Discussion 

As we have seen in previous sections, the one-level approach presented in this paper offers solutions to a 
wide range of analytical problems that arise in modelling prosodic phenomena in morphology. Enriched 
representations allow for repetition, skipping and insertion of segmental material. The non-finite-state operation
of reduplicative copying is mapped to automata intersection, an operation whose formal power is also
beyond finite-state, but which is independently needed as the most basic mechanism to combine constraints.
The distinction between contextual requirements and genuine lexical contribution is reflected in differen- 
tiating between consumers and producers of information, a move that introduces resource consciousness 
into automata. Bounded local optimization supports the formulation of optimizing analyses that seem to 
be called for in a range of empirical cases. A generic recipe for flattening prosodic constituency provides 
the basis for maximally local access to higher phonological structure. The architecture supports both gen- 
eration and parsing. Finally, the proposal proved its worth in three practical, computer-implemented case 
studies. 

As is to be expected with any new proposal, however, there are areas that deserve further investigation, 
limitations that need to be overcome and alternative design decisions that should be explored. 

First, the repeat-arc enrichment that facilitates reduplication assumed that base strings are separately 
enriched, and not their union as a whole, because only then can inadvertent repetition of parts of some 
other base be excluded. Somewhat loosely speaking we have relied on the fact that the weaker notion of type 
identity available in finite-state networks can still mimic token identity if there is only one type to consider,
i.e. in moving back and forth in time one sees only one base string at a time. A drawback of this mode of 
enrichment is that the set of base strings cannot be compressed much further, as would be possible in other 
cases of finite-state modelling. However, recall that the repeat-arc encoding imposes a highly regular layer 
onto existing base lexicons. What is regular is also predictable, hence can be virtualized: one could keep a 
minimized base lexicon and devise an on-the-fly implementation of the add_repeats regular expression
operator to add repeat arcs on demand only to the currently pursued base hypothesis. With suitable diacritics 
one could even extend such an approach to languages where exceptions that fail to reduplicate must be taken 
into account. A variant of that approach would add global register-manipulating instructions to choice arcs 
in the minimized base lexicon, e.g. the unify-test of Beesley (1998). Injecting a little bit of memory into
automata in this way would serve to synchronize choices in the base and reduplicant part of the word form, 
acting as a kind of distributed disjunction. Because the same instruction would be used twice in repetition- 
as-intersection, the use of a (limited kind of) unification is indeed necessary here. It remains to be seen 
whether such a mixed approach would be feasible in practice. 

Also, from the point of view of competence-based grammar, the proposed encoding for reduplication
is not as restricted as one might wish, because it can also encode unattested types of reduplication such as
mirror image copying ww^R, a context-free language. Here is the essential part of an actual piece of code
showing how to do it:



realize_base := [boundarySymbol, interior_part, boundarySymbol].
mirror_image_total_reduplication :=
    [realize_base, [repeat, repeat, contentSymbol] *,
     repeat, repeat, boundarySymbol, skip *].



When intersecting with an enriched version of a string like [a,b,c,d] (bracketed with boundarySymbol),
each character of the mirror-image part [d,c,b,a] will be preceded by two repeat symbols. Moreover,
because we will ultimately have stepped back to the beginning of the string, we finally need to skip over
the entire base to avoid the realization of an additional conventional reduplicant. This additional effort
suggests one way to explain the absence of mirror-image reduplication, namely by extending the general
notion of resource consciousness: traversing an arc carries a certain cost and there will be a maximum cost
threshold in production and acquisition. Under this view mirror-image reduplication fares significantly
worse than its naturally occurring competitor because it needs three times the amount of technical arcs.
Clearly, this sketch of a performance-based explanation of an important gap in natural language patterns
of reduplication needs to be worked out more fully in future research, but the initial line of attack seems
promising.
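
The walk through the enriched base that this description implies can be simulated directly. The following Python sketch is an illustration under simplifying assumptions, not the automaton-based implementation: the enriched string is modelled as a tape with a position pointer, '#' stands in for boundarySymbol, a repeat symbol moves the pointer one position back, a skip symbol moves it one position forward without realizing anything, and the function names are hypothetical.

```python
def mirror_path(base):
    """Symbol sequence of the mirror-image path for a base such as 'abcd'."""
    path = ["#"] + list(base) + ["#"]        # realize_base
    for ch in reversed(base):
        path += ["repeat", "repeat", ch]     # step back two, realize one
    path += ["repeat", "repeat", "#"]        # back to the left boundary
    path += ["skip"] * (len(base) + 1)       # skip over the rest of the base
    return path


def accepted_by_enriched_base(base, path):
    """Simulate the path on #base#; True if it ends in the final position."""
    tape = ["#"] + list(base) + ["#"]
    pos = 0
    for sym in path:
        if sym == "repeat":
            if pos == 0:
                return False
            pos -= 1
        elif sym == "skip":
            if pos >= len(tape):
                return False
            pos += 1
        else:
            if pos >= len(tape) or tape[pos] != sym:
                return False
            pos += 1
    return pos == len(tape)


def surface(path):
    """Drop technical and boundary symbols to obtain the surface string."""
    return "".join(s for s in path if s not in ("repeat", "skip", "#"))


if __name__ == "__main__":
    p = mirror_path("abcd")
    print(accepted_by_enriched_base("abcd", p), surface(p))
```

Under these assumptions the path for a base of n segments contains 2n + 2 repeat symbols and n + 1 skip symbols, which makes the overhead in technical symbols easy to count for any given base.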

Another point worth noting is that the repeat-arc encoding is not intimately tied to one-level automata. 
This is to be expected given the fact that transducer composition o can simulate intersection A & B via
identity(A) o identity(B). As demonstrated in appendix B, a version of our proposal that works
for transducers is readily constructed for those who prefer to work in a multilevel framework. 
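
The equivalence between intersection and composition of identity relations can be verified on small finite languages. The Python sketch below uses finite sets of strings as stand-ins for regular languages and sets of pairs as stand-ins for transducers; the function names and the example word lists are assumptions chosen for illustration only.

```python
def identity(language):
    """Identity relation on a language: every string is mapped to itself."""
    return {(w, w) for w in language}


def compose(r, s):
    """Relational composition: pair x with z whenever r maps x to some y
    and s maps that same y to z."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


A = {"bilih", "basah", "linis"}
B = {"basah", "linis", "dambong"}

if __name__ == "__main__":
    composed = compose(identity(A), identity(B))
    print(sorted(composed))
    print(composed == identity(A & B))
```

Composing the two identity relations leaves exactly the pairs over strings that belong to both languages, i.e. the identity relation on the intersection.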

Finally, one might worry about the apparent need for a greater number of online operations, especially 
runtime intersection. The radical alternative, relying completely on offline computations, seems rather 
unattractive in terms of storage cost and account of productivity for cases like reduplication in Bambara, 
Indonesian, etc. However, we believe that with lazy, on-the-fly algorithms for intersection etc. the runtime 
cost can still be kept at a moderate level. This is an interesting area for further experimentation, especially
when representative benchmarking procedures can be found. 

There is also room for variation with respect to the ideas about resource-conscious automata. First, 
instead of distinguishing between producers and consumers on the level of entire segmental symbols, one
could employ a finer subdivision into feature-level producers and consumers. So e.g. a segment might
produce a nasal feature but consume place-of-articulation features delivered from elsewhere.

Second, one might play with changing the logic of producer/consumer combination. For example, 
in a stricter setting corresponding to linear logic, resources could be consumed only once and leftover 
unconsumed resources would result in ungrammaticality. One potential application area might be to prevent 
coalescent overlap, signalled by a doubly produced resource at the position of coalescence. However, the 
intuitionistic behaviour embodied in the present proposal seems to fit more naturally with the demands of 
productive reduplication, where base segments are consumed two or even more times. In contrast, linear 
behaviour would require explicit re-production of consumed resources here. More analyses need to be
undertaken to find out whether both types of behaviour (or even more) are needed to formulate elegant 
grammars of a wider range of phenomena, or whether a single uniform combinatory logic suffices. 

Finally, there is a possibility that resource consciousness of the kind proposed here may be reducible
to classical arc intersection via global grammar analysis/transformation. This is not altogether implausible
given the fact that LFG's bipartite setup of constraint and constraining equations is now seen as involving
resource-conscious notions, while at the same time a theoretical reduction of LFG grammars to simpler
ones employing only conventional constraint equations (i.e. pure unification grammars) exists. To be
sure, even if such a reduction were to prove feasible in our case, the value of resource-consciousness
as a high-level concept would still be unaffected.

For Bounded Local Optimization, a brief note will do, namely that its local search for minimal-weight 
arcs can profitably be integrated with the elimination of consumer-only arcs in phase 11 of grammatical
evaluation. As a result, we now need only one pass over (a fraction of) the automaton, with a subsequent 
gain in efficiency. 

In conclusion, while there are certainly areas in need of future research, the one-level approach to 
prosodic morphology presented in this paper already offers an attractive way of extending finite-state tech- 
niques to difficult phenomena that hitherto resisted elegant computational analyses. 

A Alternative method: Lexical transducer plus intelligent copy module

Finite-state transducers are currently the most popular implementation device in application-oriented computational
morphology, mainly for reasons of low time complexity, reversibility for generation and parsing
and the existence of minimization methods. They are used to model both the basic lexicon and its addi- 
tional morphological and phonological variation. However, to date there exists no proposal that can handle 
reduplicative morphology in general, apart from isolated attempts at simpler, rather local instances (cf. a 
Tagalog case modelled by Antworth 1990). In view of the fact that the copy language ww is context sensi- 
tive, some compromise or approximation must obviously be found, at least when the finite-state assumption 
is to be maintained. The following ideas form part of such an approximative solution. 

The first part of the solution separates the copying aspect from finite-state-based lexicon and 
underlying-to-surface variation. 

(44) Separate copy

    Lexicon + Phonology/Morphology (finite-state) → copy → Surface form



To preserve regularity, the copy must be made outside of the lexicon + phonology FSA/FST. Since 
there are languages where phonological rules apply to a reduplicated form, the copy should not be done 
before the phonology. Also, while conceptually lexicon and phonology may be separated, in practice they 
are normally composed to form a single lexical transducer (Karttunen 1994). The key advantage of this 
compilation step is that a possibly exponential growth of the rule transducer is prevented in practice through
the limiting contexts provided by the lexicon. However, having only a single lexical transducer means that 
an intermediate stage is no longer available for copying before applying the phonology. 

Having secured the proper place of a copy module after the lexical FST or FSA, we next need to devise
a way to control copying. Since we cannot model the nested nonlocal dependencies exhibited by ww-type 
reduplicative constructions directly, let us instead encode a segment-local promise to reduplicate, to be 
fulfilled by the copy module. In (45) we see a first example from German. 

(45) Segment-local encoding of multiple realisations

    surface                       techtelmechtel

    encoding    segment string    t  e  c  h  t  e  l  m
                # copies          1  2  2  2  2  2  2 -1



In a nutshell, the copy module now gets complex symbols, which consist of a segment proper and an 
annotation for the number of copies to be made. The module scans a string of complex symbols from left to 
right, outputting each segment that has a nonzero number of copies. By convention, whenever the number 
of copies is negative, a new scan is started after outputting the current segment. The new scan performs 
the same actions as before, except that it first decrements the value of the number of copies when positive 
while incrementing when negative. Thus, in our example we have: 

(46) Stepwise derivation of example (45)

    Scan    Input                                       Output

    1       segment string    t  e  c  h  t  e  l  m    techtelm
            # copies          1  2  2  2  2  2  2 -1

    2       segment string    t  e  c  h  t  e  l  m    echtel
            # copies          0  1  1  1  1  1  1  0

Note that rescans are only initiated upon encountering a negative number, so that incrementing the last
occurrence of -1 suffices to stop rescanning. Here then is a more systematic tabulation of proper usage of
the above encoding (47), followed by pseudocode of the algorithm interpreting it (48). 

(47) NUMBER-OF-COPIES ENCODING SCHEMA

                      Scan 1    Scan 2    |#copies|

    output segment    yes       yes       2
                      yes       no        1
                      no        yes       0

(48) COPY Algorithm

    input string of symbols Input[1 ... N]   condition length(Input) > 0
    ScanPos ← TempPos ← LastRescanPos ← 0
    AtEnd ← length(Input)
    repeat
        ScanPos ← ScanPos + 1
        CurrentSymbol ← Input[ScanPos]
        if CurrentSymbol.NumCopies ≠ 0 then
            output CurrentSymbol.Segment
        endif
        if CurrentSymbol.NumCopies < 0 then
            Rescan ← true
            TempPos ← ScanPos
            ScanPos ← LastRescanPos
            if CurrentSymbol.NumCopies = -1 then
                LastRescanPos ← TempPos
            endif
        else Rescan ← false
        endif
        if CurrentSymbol.NumCopies > 0 then
            CurrentSymbol.NumCopies ← CurrentSymbol.NumCopies - 1
        else
            CurrentSymbol.NumCopies ← CurrentSymbol.NumCopies + 1
        endif
    until ScanPos = AtEnd ∧ ¬Rescan



It is not difficult to see that the algorithm always terminates. Informally, because every input string is of 
finite length and the number of copies associated with each segmental position must become positive in a
finite number of steps (by way of the last if-then-else statement), the number of rescans initiated by negative
number-of-copies annotations will be finite as well. 
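
For readers who want to experiment with the encoding, the pseudocode in (48) can be transcribed almost line by line into Python. The sketch below is such a transcription under stated assumptions: the function name copy_linearise, the parallel-list input format and the bounds guard are additions made here, and the reading of the comparison operators follows the prose description (output on nonzero annotations, rescan on negative ones).

```python
def copy_linearise(segments, copies):
    """Spell out the surface form of a string of (segment, #copies) symbols.

    segments: sequence of segment symbols, copies: parallel list of ints.
    """
    if not segments or len(segments) != len(copies):
        raise ValueError("need a nonempty input, one annotation per segment")
    num = list(copies)       # working copy; the caller's list stays intact
    output = []
    scan_pos = 0             # 1-based position last read; 0 = before start
    last_rescan_pos = 0
    at_end = len(segments)
    while True:
        scan_pos += 1
        if scan_pos > at_end:    # guard added here; not part of (48)
            raise ValueError("ran past the end of the input")
        i = scan_pos - 1
        current = num[i]
        if current != 0:
            output.append(segments[i])
        if current < 0:
            rescan = True
            temp_pos = scan_pos
            scan_pos = last_rescan_pos
            if current == -1:
                last_rescan_pos = temp_pos
        else:
            rescan = False
        # Positive annotations count down, all others count up.
        num[i] = current - 1 if current > 0 else current + 1
        if scan_pos == at_end and not rescan:
            break
    return "".join(output)


if __name__ == "__main__":
    print(copy_linearise("techtelm", [1, 2, 2, 2, 2, 2, 2, -1]))
    print(copy_linearise("umbasa", [0, 0, -1, 1, 1, 1]))
```

The transcription has been traced by hand only for encodings whose negative annotations are -1, such as the example in (45); how annotations below -1 behave follows from the pseudocode as printed and is not asserted here.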

What are salient properties of the solution just proposed? It is 

• dynamic, i.e. the surface form of a given input token can be computed by the algorithm, while the 
set of all such surface forms is not statically represented 

• not persistently linearised, i.e. the complete surface linearization of the input cannot be read off the
input easily using only finite-state mechanisms

• not uniquely encoded, i.e. there is a countably infinite number of possible input codes for any
given surface form. The chief source of infiniteness is 0-annotated material that is outside
of a rescan domain, for example <t₁e₂c₂h₂t₂e₂l₂m₋₁x₀x₀x₀x₀ ...>. In this way one may add an
arbitrary amount of underlying material that will never be output by the copy algorithm.

Let us comment on some of these aspects. The dynamic aspect seems unavoidable as long as the finite-state 
assumption is upheld in its most restricted form, i.e. there are no online operations with greater-than-finite-state
power like composition, intersection etc. during parsing and generation.

The next point about full linearization being performed outside of phonology proper, however, causes 
more concern. One reason is that there are cases where the shape of reduplicants depends on their surface 
position, as we have seen in Koasati (recall the contrast between penultimate heavy syllables in ak-ho-lat.lin
vs. tahas-too.-pin). Of course this is just an instance of the general fact that phonological alternations
often depend on surface syllable structure. But those surface syllables may cut through the 'folded'
encoding of inputs: e.g. <tech.tel.mech.tel> has <mech> as the third syllable, which happens to be nonlocally
represented under the input encoding <t₁e₂c₂h₂t₂e₂l₂m₋₁>. Also, the example of Washo shows that
surface-derived syllable roles may differ between reduplicant and base, with concomitant phonological 
consequences (data taken from Wilbur 1973, 17): 

(49) Surface linearization and Washo reduplication 

w'etwedi 'it's quacking' 

J'upjubi 'he's crying gently' 

tum?s'opsobi 'he's splashing his feet' 

b'akbagi 'he's smoking' 

In the data under (49) we can see that obstruents get devoiced in the coda. Crucially, however, the relevant 
codas are indirectly created by reduplication: a C1VC2 reduplicant prefixed to the base results in a VCCV
context that surface-based syllabification resolves by assigning coda-onset roles to the CC cluster. Since 
C2 in the base ends up being in an onset position, it is not devoiced, resulting in the phonological difference 
between reduplicant and base underlined in (49). 

Washo seems to present a problem for our proposed input encoding: when using a representation like
<w₂e₂d₋₂i₁> that only covers the underlying base, the independence between base and reduplicant required
for phonological differences is lost. However, the last aspect about the non-uniqueness of input encodings,
which has not been illustrated so far, actually provides hope for such situations. Let us therefore
have a look at some alternative encodings for reduplicative forms in (50). Example (50).a2 demonstrates
an alternative, more local encoding for techtelmechtel that brings original /t/ and its modification /m/ into
close proximity. It also allows us to syllabify the string /tmechtel/ as if it were a surface string, e.g. using
a finite-state version of the proposals of Walther (1992) et seq. Unlike in (50).a1, the critical /m/ will then
become part of a complex onset, thus correctly receiving the same syllable role as in the linearized surface
form. The examples (50).b1,2 show that locality vs nonlocality is not the only source of ambiguity in the
linearization code: original-modification pairs such as <i₁a₀> may be reversed in reduplicative constructions
without affecting the surface result. In general, precedence relationships in the input string are only
relevant if the corresponding annotations target the same scans. The next set of examples (50).c1-4 shows
a broad range of options differing in the placement of fixed material. (50).c1 is approximately surface-true
in so far as it has the /oo/ in the penultimate syllable position. However, it simultaneously serves to again
show the potential for non-surface truth within individual syllable role assignments: /s/ will be assigned an
onset role, but the surface realization has a coda instead. Example (50).c2 illustrates a different compromise
in that it sees all of the reduplicative affix as a non-distributed prefix. Example (50).c3 shows a slightly odd
analysis that denies reduplicative status to tahastoopin in annotating an input suffix to end up as an infix
after linearization. Finally, (50).c4 separates reduplicated and fixed-melody material, seeing only the latter
as suffixed to the input base. The last two encodings share the property of leaving the base uninterrupted,
which may be advantageous in terms of compression degree of the base lexicon. The last pair of examples
(50).d1-2 illustrates how to proceed when syllable roles and segmental realization differ between base and
copy: instead of incorrectly representing just one token (50).d1, it is sometimes possible to maintain two
adjacent copies (50).d2 in such a way that the input encoding itself can be correctly syllabified.

As we have just seen, the encoding is quite versatile. In fact, it is not even limited to describing reduplicative
patterns alone but can also handle infixations, circumfixations and truncations. (51) shows a few
more examples from previous sections.

(50) Non-Uniqueness in Encoding of Surface Forms

           Surface            Encoding

    a1.    techtelmechtel     t  e  c  h  t  e  l  m
                              1  2  2  2  2  2  2 -1
    a2.                       t  m  e  c  h  t  e  l
                              1  0  2  2  2  2  2 -2

    b1.    schnickschnack     s  c  h  n  i  a  c  k
                              2  2  2  2  1  0  2 -2
    b2.                       s  c  h  n  a  i  c  k
                              2  2  2  2  0  1  2 -2

    c1.    tahastoopin        t  a  h  a  s  o  o  p  i  n
                              2  1  1  1 -1  1  1  1  1  1
    c2.                       t  o  o  a  h  a  s  p  i  n
                              2  0  0  1  1  1 -1  1  1  1
    c3.                       t  a  h  a  s  p  i  n  t  o  o
                              1  1  1  1  1  0  0  0  1  1 -1
    c4.                       t  a  h  a  s  p  i  n  o  o
                              2  1  1  1 -1  0  0  0  1 -1

    d1.    *wedwedi           w  e  d  i
                              2  2 -2  1
    d2.    wetwedi            w  e  t  d  i
                              2  2 -1  1  1



(51) More Encoded Examples

    Encoding                      Surface

    w  u  l  u  o                 wuluowulu
    2  2  2  2 -1

    c ? e e t                     ctc?eet
    2 0 -2

    s  i  l  i  n                 silslin
    2  1 -2  1  1

    v  e  l  o                    velelo
    1  2 -2  1

    k  r  a  n  d  h  i           kanikrandh
    2  0  2  2  0  0 -1

    u  m  b  a  s  a              bumasa
    0  0 -1  1  1  1

    b  e  r  g  g  e  t  e        gebergte
    0  0  0  0  1 -1  1  1

    kKuJtJofi                     kKuJtJi
    1 1 1 1 1 1 0 0 1



The Bella Coola form silslin in (51) shows that straightforward use of the encoding undermines a
succinct analysis of cases where segmental realizations are the same, although the syllable roles differ
between a base segment and its corresponding partner in the reduplicant. Assuming the syllabification /.sil.slin./,
the /l/ is in coda position in the reduplicative prefix (spelled out in scan 1), but in onset position in the
base (spelled out in scan 2), contrary to the unique onset position it receives when directly syllabifying
the input /silin/. Of course, by introducing a redundant extra /l/ as in the Washo case we could rescue
surface-like syllabification, albeit at the cost of introducing considerable redundancy and with no other
motivation but to avoid technical difficulties. (Note that in this case, though, the systematic local copy
of a single consonant C_i C_i^copy in each base would be technically feasible using a finite-state rule.) To
conclude, while all the information to ultimately deduce surface position - and thus, surface prosodic
role - is formally present in the input, it is often difficult to exploit that information. Sometimes at least the
individual solutions would seem to include construction-specific, non-surface-referring rules for finite-state
syllabification that preclude maximal modularity and component reuse in analyses.

Thus far I have described the proposed local encoding method from a generation or production
perspective only, where a single string of complex symbols gets spelled out on the surface. It remains to show
that the method is also usable for parsing or perception, where matching against a finite-state-encoded set
of strings is required. Also, finite-state networks are attractive because minimization algorithms exist to
reduce storage requirements, and it is natural to ask whether these continue to produce good results under
the new encoding.

Turning first to the issue of parsing surface forms, let us assume as before that the lexicon takes the
form of an acyclic FSA or FST, using complex symbols to encode segments. In a loop the parsing algorithm
would then read a single surface segment and try to match it to the segment symbol of an arc emanating
from the current state of the network. However, encountering a symbol with zero annotation causes that
arc to be skipped, repeating the matching attempt with the following arc(s). For various reasons it may
become necessary to choose nondeterministically which arc from a set of possible arcs should be followed.
As the linearisation of encoded strings works off a single string only, and furthermore modifies its
number-of-copies annotations during consecutive scans, we need to copy at least the annotation information
of the partial path hypothesis that is currently pursued to a separate buffer. The copying can of course be
done incrementally. All rescans will then only examine this buffer, which contains the current state of the
number-of-copies annotations, whereas the network itself remains unmodified. Also, the stored path hypothesis
must be retracted upon backtracking, which is initiated when a segment match fails and the current hypothesis
therefore cannot be correct, so that we actually need a stack of partial path buffers. A match
failure pops off one entry, whereas each nondeterministic choice records the arc chosen and provides a new
partial path buffer which initially contains a copy of the previous one. However, upon closer examination
it becomes clear that recording the current scan number and ScanPos actually suffices to uniquely determine
the state of the algorithm in (48), hence we might trade (recomputation) time for space in this way. The
precise definition of suitable data structures and of the recognition algorithm itself remains to be worked
out, but it seems reasonably clear that this can be done.

Actually, a second route to take would be to make the copy algorithm (48) itself reversible, instead of
implementing a separate version for recognition. Borrowing from existing work in Prolog implementation
it seems sufficient to implement a backtrackable destructive assignment operation on top of the
backtracking mechanism that is independently needed for traversing nondeterministic networks. Again, this
sketch obviously needs more detail to facilitate full evaluation.
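
The idea of a backtrackable destructive assignment can be illustrated with a small undo log, loosely modelled on the trail of a Prolog engine. This is a minimal Python sketch under stated assumptions, not part of the proposal itself: the class name Trail and its methods mark, assign and undo_to are hypothetical, and the annotation buffer is simply a Python list.

```python
class Trail:
    """Undo log for destructive updates to annotation buffers."""

    def __init__(self):
        self._log = []                      # entries: (buffer, index, old value)

    def mark(self):
        """Return a choice-point marker for the current state of the log."""
        return len(self._log)

    def assign(self, buf, idx, value):
        """Destructively set buf[idx], remembering the old value."""
        self._log.append((buf, idx, buf[idx]))
        buf[idx] = value

    def undo_to(self, mark):
        """Backtrack: restore every assignment made since `mark`."""
        while len(self._log) > mark:
            buf, idx, old = self._log.pop()
            buf[idx] = old


counts = [1, 0, 2, -2]                      # number-of-copies annotations
trail = Trail()
choice_point = trail.mark()
trail.assign(counts, 0, 0)                  # tentative updates during a scan
trail.assign(counts, 2, 1)
after_update = list(counts)
trail.undo_to(choice_point)                 # a match failed: roll back
```

Only the changed cells are recorded, so no full copy of the partial path buffer is needed at each nondeterministic choice.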

The final issue worth discussing is minimization and storage requirements for networks encoded
with the proposed method. Obviously, if the same segments carry different number-of-copies annotations,
they are different when viewed as complex symbols. Without further means this simple fact diminishes
the chance for sharing of common substrings. This is illustrated in (52) using the two German words
techtelmechtel, technik.
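
The loss of prefix sharing can be made concrete by counting the states of a prefix tree over the two words. The Python sketch below is only an illustration: it uses a trie rather than a minimized automaton (so suffix sharing is ignored), it takes the encoding a1 from (50) for techtelmechtel, and it assumes that technik, having no reduplication, is annotated with 1 throughout. That annotation and the helper names are assumptions.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two symbol sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def trie_states(words):
    """Number of states of a prefix tree over the given symbol sequences."""
    prefixes = {()}                          # the root corresponds to the empty prefix
    for w in words:
        for i in range(1, len(w) + 1):
            prefixes.add(tuple(w[:i]))
    return len(prefixes)


# Complex symbols are (segment, annotation) pairs.
techtel = list(zip("techtelm", [1, 2, 2, 2, 2, 2, 2, -1]))   # a1 in (50)
technik = list(zip("technik", [1, 1, 1, 1, 1, 1, 1]))        # assumed annotation

shared_plain = common_prefix_len("techtelmechtel", "technik")
shared_complex = common_prefix_len(techtel, technik)

full_form_states = trie_states(["techtelmechtel", "technik"])
encoded_states = trie_states([techtel, technik])
segments_only_states = trie_states(["techtelm", "technik"])
```

In this toy case the complex symbols share only the initial arc, whereas the bare segments share the prefix tech; the encoded trie is nevertheless smaller than the full-form one because the reduplicated part is stored once.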




However, only large-scale experiments with lexicons containing reduplicative constructions can tell
whether the loss of sharing compared to a naive full-form lexicon is not outweighed in practice by the
saving that comes with not having to repeat reduplicative parts on the segmental level.

In the face of these initial difficulties, a natural question to ask is this: can we have the best of both 

Note that with very large networks, such as that employed in AT&T speech applications, the network data themselves may
reside on read-only disk and not fit into main memory at all. Rather, they will be swapped in on demand through memory-mapped
files. Similarly, networks might be stored in ROM, again necessitating RAM-resident copies to record on-the-fly modifications.
worlds, i.e. minimization results approaching those of non-complex-encoded networks and the space saving
that potentially comes with the proposed method? To preview, it seems that only approximative solutions
are possible. A first idea that comes to mind is to linearize the encoding, following each segment with
the annotation that goes with it, e.g. t1m0e2c2h2t2e2l-2. However, it is easy to see that - given a regular
set S containing only even-length strings - the 'intercalated union' U of the projection O of S containing
only the letters at odd positions (e.g. segments like tmechtel) with the projection E containing only even
string positions (e.g. annotations like 1022222-2) is not equivalent to S itself, thus precluding an approach
that would couple separate offline minimization with parallel online traversal, i.e. alternating between the
O and E machines depending on string position. In general U is strictly larger than S, because choices
in E and O are not synchronized. One could try to develop sophisticated schemes that implement a kind
of distributed disjunctions to effect synchronization, while simultaneously taking into account that in our
application $|E| \ll |U|$. However, in order to decide which synchronized choice to make in E after having
made a previous choice at a node $k \in O$, we in general require knowledge of the entire path travelled in O
so far. Unfortunately, storing such paths and synchronization marks is likely to become costly again.
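
That the intercalated union overgenerates is easy to verify on a toy example. The following Python sketch uses a two-word set S whose strings alternate a segment with a one-character annotation; the strings themselves and the function names are made up for illustration.

```python
def odd_projection(s):
    """Letters at odd string positions (1st, 3rd, ...): the segments."""
    return s[0::2]


def even_projection(s):
    """Letters at even string positions (2nd, 4th, ...): the annotations."""
    return s[1::2]


def intercalate(o, e):
    """Interleave a segment string with an annotation string of equal length."""
    return "".join(a + b for a, b in zip(o, e))


def intercalated_union(O, E):
    """All interleavings of some member of O with some equally long member of E."""
    return {intercalate(o, e) for o in O for e in E if len(o) == len(e)}


S = {"t1a2", "p2i1"}                        # toy linearized encodings
O = {odd_projection(s) for s in S}          # {"ta", "pi"}
E = {even_projection(s) for s in S}         # {"12", "21"}
U = intercalated_union(O, E)
spurious = U - S                            # pairings that S never licensed
```

Here U contains four strings, two of which pair a segment string with the annotation string of the other word, which is exactly the lack of synchronization between choices in O and E.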

A second approach to improving minimization quality would consist of building a variant of the
network that implements perfect hashing from each word $w_i$ into a natural number $i \in \{1, \ldots, N\}$,
again assuming an acyclic FSA/FST that contains a finite number $N$ of words (see e.g. Daciuk 1998 for
an implementation, or US patent 5,551,026 granted August 27, 1996 to Xerox). We could then store the
annotation vector $v_i$ for $w_i$ in a table indexed by $i$. Unfortunately, this will definitely increase the
amount of search/backtracking in recognition, as the annotations are crucial in defining surface shape, but
will not become available incrementally in this approach. It might also not help much in saving storage space
unless clever compression is employed for the annotation table.
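
The word-to-number mapping can be sketched as follows. This Python fragment is a simplified illustration, not the cited implementations: it uses a plain trie instead of a minimized automaton, stores at each state the number of words reachable from it, and numbers words in alphabetical order. The names Node, build, word_to_index and lookup are hypothetical; the four encoded words and their annotation vectors are taken from (50) and (51).

```python
class Node:
    def __init__(self):
        self.children = {}      # segment -> Node
        self.final = False      # does a word end here?
        self.count = 0          # number of words passing through or ending here


def build(words):
    root = Node()
    for w in sorted(set(words)):
        node = root
        node.count += 1
        for ch in w:
            node = node.children.setdefault(ch, Node())
            node.count += 1
        node.final = True
    return root


def word_to_index(root, word):
    """Return the 1-based alphabetical rank of `word`, or None if absent."""
    index, node = 1, root
    for ch in word:
        if ch not in node.children:
            return None
        if node.final:          # a shorter word ending here precedes `word`
            index += 1
        for other, child in node.children.items():
            if other < ch:      # all words below smaller siblings precede `word`
                index += child.count
        node = node.children[ch]
    return index if node.final else None


lexicon = {
    "wuluo": (2, 2, 2, 2, -1),
    "silin": (2, 1, -2, 1, 1),
    "velo": (1, 2, -2, 1),
    "tmechtel": (1, 0, 2, 2, 2, 2, 2, -2),
}
root = build(lexicon)
annotation_table = [None] * len(lexicon)
for w, vec in lexicon.items():
    annotation_table[word_to_index(root, w) - 1] = vec


def lookup(word):
    """Fetch the annotation vector of a complete word via its hash index."""
    i = word_to_index(root, word)
    return None if i is None else annotation_table[i - 1]
```

Note that lookup can only succeed once the complete segment string has been traversed, which reflects the drawback mentioned above: the annotations do not become available incrementally.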

Another approach sees the annotations as weights in a weighted FSA/FST. Again each word $w_i$ has
an annotation vector $v_i$, but now we can view the vectors as being composed from suitable smaller parts
distributed over the network so as to facilitate maximal compression. For example, weights could be strings
from (-2|-1|0|1|2)* and the weight-combining operator could be concatenation. There exist minimization
methods for the weighted case that essentially comprise two phases (Mohri, Riley & Sproat 1996). The
first phase pushes the weights to the beginning of the FSA/FST as far as possible, thus enabling greater
similarity - and therefore greater shareability - towards the end. The second phase consists of ordinary
FSA/FST minimization. This approach then promises to address the storage issue better than the previous
one, although it is similar in that again one has to wait until the end of each input string to get the
composed total weight that drives the inverse copy algorithm.
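
The first phase, pushing string weights towards the start, can be sketched on a small prefix tree. The Python code below is a minimal illustration under several assumptions: it works on a trie rather than a general acyclic automaton, it pushes by factoring out longest common prefixes under concatenation, and it is not the algorithm of the cited work. The annotation vector for technik (all ones) and all names (State, insert, push, total_weight) are hypothetical.

```python
class State:
    def __init__(self):
        self.arcs = {}          # segment -> [weight tuple, target State]
        self.final = None       # final weight (tuple), or None if not final


def insert(root, word, vector):
    """Add a word; initially its whole annotation vector sits at the final state."""
    s = root
    for sym in word:
        s = s.arcs.setdefault(sym, [(), State()])[1]
    s.final = tuple(vector)


def common_prefix(seqs):
    if not seqs:
        return ()
    first, n = seqs[0], len(seqs[0])
    for s in seqs[1:]:
        i = 0
        while i < min(n, len(s)) and first[i] == s[i]:
            i += 1
        n = i
    return first[:n]


def push(state):
    """Move weight towards the start; return the prefix pulled out of `state`."""
    for arc in state.arcs.values():
        arc[0] = arc[0] + push(arc[1])
    residuals = [arc[0] for arc in state.arcs.values()]
    if state.final is not None:
        residuals.append(state.final)
    prefix = common_prefix(residuals)
    for arc in state.arcs.values():
        arc[0] = arc[0][len(prefix):]
    if state.final is not None:
        state.final = state.final[len(prefix):]
    return prefix


def total_weight(root, initial, word):
    """Concatenate initial weight, arc weights and final weight along a path."""
    weight, s = initial, root
    for sym in word:
        w, s = s.arcs[sym]
        weight += w
    return weight + s.final


lexicon = {"techtelm": (1, 2, 2, 2, 2, 2, 2, -1),    # a1 in (50)
           "technik": (1, 1, 1, 1, 1, 1, 1)}         # assumed annotation
root = State()
for word, vec in lexicon.items():
    insert(root, word, vec)
initial = push(root)
```

After pushing, the annotation shared by both words ends up as the initial weight, the shared arcs for t, e, c, h carry empty weights, and the remainders sit on the first arcs where the two words diverge; the composed total weight of each word is unchanged.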

Finally, annotations could simply be output strings in an ordinary FST, more precisely a sequential
transducer, where the associated input automaton is deterministic. Mohri (1994) has described an
algorithm to make such transducers very compact, based on the closely related idea of pushing partial output
strings (represented as members of an enlarged alphabet) towards the initial state of the FST, and then
minimizing this FST in the sense of traditional automata minimization. Advantages and disadvantages
of this approach seem to be quite similar to those of the previous one, except that by identifying output
strings with annotations instead of traditional surface strings we have perhaps lost the ability to use
canonical underlying representations with morphological and other annotations.



Note that this linearized encoding generalizes to finite-state transducers, where Input : Output pairs can be sequentialized as
Input Output to convert the transducer into an ordinary automaton (cf. fig. 9 in US patent 5,625,554 granted to Xerox on April
29, 1997). Of course, application of underlying and surface strings must then skip odd and even arcs to match and also output the
following or preceding arc as result.
Alternative method:

Reduplicative copying as transducer composition 

%% This code assumes FSA Utilities (http://www.let.rug.nl/~vannoord/Fsa/)

:- op(1200, xfx, ':=').   %% separates name and definition of macro

%% expand new macro notation into FSA Utilities format during file read-in

:- multifile(user:term_expansion/2).
user:term_expansion(':='(Head, Body), macro(Head, Body)).
user:term_expansion(':-'(':='(Head, Body), PrologGoals),
                    (macro(Head, Body) :- PrologGoals)).

%% the grammar proper

% add_repeats/1 is best defined as manipulation of the underlying automaton:
% for each noncyclic arc A --> B, add B --[]/repeat--> A

rx(add_repeats(E), FA) :-
    fsa_regex:rx(E, FA0),
    fsa_regex:add_symbols([repeat], FA0, FA1),
    fsa_data:copy_fa_except(transitions, FA1, FA, Transitions0, Transitions),
    findall(trans(From, []/repeat, To),
            ( member(trans(To, _, From), Transitions0),
              To \== From ),
            RepeatTransitions),
    append(Transitions0, RepeatTransitions, Transitions1),
    sort(Transitions1, Transitions).

boundary := escape(#).
epsilon := [].
technical_symbols := { boundary, repeat }.

enriched_words([]) := {}.
enriched_words([Word|Words]) :=
    { add_repeats([ epsilon:boundary, word(Word), epsilon:boundary ]),
      enriched_words(Words) }.

% N.B. {} denotes the empty language, [] is the empty string/epsilon. The builtin
% word(Atom) yields an automaton where the characters of Atom are concatenated

reduplicant := [boundary:epsilon, (? - technical_symbols)*, boundary:epsilon].

% N.B. ? is the any (meta) symbol, A - B is set difference.
% Regular languages are automatically coerced to relations/transducers
% if necessary (e.g. in the context of composition o)

total_reduplication(WordList) :=
    enriched_words(WordList)
      o
    [reduplicant, repeat:epsilon *, reduplicant].

test := total_reduplication([orang, utan]).